Statistics by subject – Statistical methods


All (25 of 118 results)

  • Articles and reports: 12-001-X200900211044
    Description:

    In large-scale sample surveys it is common practice to employ stratified multistage designs where units are selected using simple random sampling without replacement at each stage. Variance estimation for these designs can be quite cumbersome to implement, particularly for non-linear estimators. Various bootstrap methods for variance estimation have been proposed, but most are restricted to single-stage designs or two-stage cluster designs. An extension of the rescaled bootstrap method (Rao and Wu 1988) to stratified multistage designs is proposed; it extends easily to any number of stages. The proposed method is suitable for a wide range of reweighting techniques, including the general class of calibration estimators. A Monte Carlo simulation study was conducted to examine the performance of the proposed multistage rescaled bootstrap variance estimator.

    Release date: 2009-12-23
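
The single-stage building block that the abstract extends can be sketched in a few lines. This is an illustrative sketch of the Rao-Wu (1988) rescaling for a stratified single-stage design, not code from the paper; the function name and the choice of a weighted (Hajek) mean as the estimator are our own.

```python
import numpy as np

def rao_wu_bootstrap_variance(y, w, strata, n_boot=500, seed=None):
    """Rescaled bootstrap (Rao-Wu 1988) variance of a weighted mean
    under stratified single-stage sampling.

    In each stratum h, n_h - 1 units are resampled with replacement;
    with that choice the rescaled weight of unit i reduces to
    w_i * r_i * n_h / (n_h - 1), where r_i counts how often the unit
    was drawn in the bootstrap replicate.
    """
    rng = np.random.default_rng(seed)
    y, w, strata = map(np.asarray, (y, w, strata))
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        w_star = np.zeros(w.size)
        for h in np.unique(strata):
            idx = np.flatnonzero(strata == h)
            n_h = idx.size
            # resample n_h - 1 units with replacement within the stratum
            draws = rng.choice(idx, size=n_h - 1, replace=True)
            r = np.bincount(draws, minlength=w.size)[idx]
            w_star[idx] = w[idx] * r * n_h / (n_h - 1)
        estimates[b] = np.sum(w_star * y) / np.sum(w_star)  # Hajek mean
    return estimates.var(ddof=1)
```

For a non-linear estimator, only the line computing the replicate estimate changes; the rescaled weights are reused as-is, which is what makes the method attractive for calibration estimators.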

  • Articles and reports: 12-001-X200900211042
    Description:

    This paper proposes an approach for small area prediction based on data obtained from periodic surveys and censuses. We apply our approach to obtain population predictions for the municipalities not sampled in the Brazilian annual Household Survey (PNAD), as well as to increase the precision of the design-based estimates obtained for the sampled municipalities. In addition to the data provided by the PNAD, we use census demographic data from 1991 and 2000, as well as a complete population count conducted in 1996. Hierarchically non-structured and spatially structured growth models that gain strength from all the sampled municipalities are proposed and compared.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211046
    Description:

    A semiparametric regression model is developed for complex surveys. In this model, the explanatory variables are represented separately as a nonparametric part and a parametric linear part. The estimation techniques combine nonparametric local polynomial regression estimation and least squares estimation. Asymptotic results, such as consistency and normality of the estimators of the regression coefficients and the regression functions, are also developed. The performance of the methods and the properties of the estimates are demonstrated through simulations and empirical examples using the 1990 Ontario Health Survey.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211037
    Description:

    Randomized response strategies, which were originally developed as statistical methods to reduce nonresponse as well as untruthful answering, can also be applied in the field of statistical disclosure control for public use microdata files. In this paper a standardization of randomized response techniques for the estimation of proportions of identifying or sensitive attributes is presented. The statistical properties of the standardized estimator are derived for general probability sampling. In order to analyse the effect of different choices of the method's implicit "design parameters" on the performance of the estimator, we have to include measures of privacy protection in our considerations. These yield variance-optimum design parameters given a certain level of privacy protection. To this end the variables have to be classified into different categories of sensitivity. A real-data example applies the technique in a survey on academic cheating behaviour.

    Release date: 2009-12-23
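
The simplest member of the family that such a standardization covers is Warner's classical design. The sketch below is illustrative, assumes simple random sampling, and is not taken from the paper.

```python
def warner_estimate(n_yes, n, p_design):
    """Warner's (1965) randomized-response estimator of the proportion
    pi of a sensitive attribute. Each respondent answers the sensitive
    statement with probability p_design and its negation otherwise, so
    the interviewer never learns which statement was answered.
    """
    if p_design == 0.5:
        raise ValueError("p_design = 0.5 makes pi non-identifiable")
    lam_hat = n_yes / n  # observed proportion of 'yes' answers
    pi_hat = (lam_hat - (1 - p_design)) / (2 * p_design - 1)
    # approximate variance under simple random sampling; the privacy
    # protection inflates it through the (2p - 1)^-2 factor
    var_hat = lam_hat * (1 - lam_hat) / (n * (2 * p_design - 1) ** 2)
    return pi_hat, var_hat
```

The trade-off the abstract describes is visible in the variance formula: design parameters closer to 0.5 give more privacy protection but a larger variance.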

  • Articles and reports: 12-001-X200900211043
    Description:

    Business surveys often use a one-stage stratified simple random sampling without replacement design with some certainty strata. Although weight adjustment is typically applied for unit nonresponse, the variability due to nonresponse may be omitted in practice when estimating variances. This is problematic especially when there are certainty strata. We derive some variance estimators that are consistent when the number of sampled units in each weighting cell is large, using the jackknife, linearization, and modified jackknife methods. The derived variance estimators are first applied to empirical data from the Annual Capital Expenditures Survey conducted by the U.S. Census Bureau and are then examined in a simulation study.

    Release date: 2009-12-23
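
As background for the jackknife variants the abstract mentions, the baseline delete-one jackknife for a one-stage stratified design (without any nonresponse adjustment) can be sketched as follows. This is an illustrative sketch, not the authors' code; the weighted-total estimator is our choice.

```python
import numpy as np

def jackknife_variance(y, w, strata):
    """Delete-one jackknife variance of a weighted total under one-stage
    stratified sampling. When a unit is deleted, the weights of the
    remaining units in its stratum are inflated by n_h / (n_h - 1).
    """
    y, w, strata = map(np.asarray, (y, w, strata))
    var = 0.0
    for h in np.unique(strata):
        idx = np.flatnonzero(strata == h)
        n_h = idx.size
        reps = []
        for j in idx:
            w_rep = w.copy()
            w_rep[idx] = w_rep[idx] * n_h / (n_h - 1)  # rescale stratum h
            w_rep[j] = 0.0                             # delete unit j
            reps.append(np.sum(w_rep * y))
        reps = np.asarray(reps)
        var += (n_h - 1) / n_h * np.sum((reps - reps.mean()) ** 2)
    return var
```

The methods in the paper modify this recipe so that the nonresponse weight adjustment is repeated within each replicate, which is where the omitted nonresponse variability re-enters.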

  • Articles and reports: 12-001-X200900211039
    Description:

    Propensity weighting is a procedure to adjust for unit nonresponse in surveys. A form of implementing this procedure consists of dividing the sampling weights by estimates of the probabilities that the sampled units respond to the survey. Typically, these estimates are obtained by fitting parametric models, such as logistic regression. The resulting adjusted estimators may become biased when the specified parametric models are incorrect. To avoid misspecifying such a model, we consider nonparametric estimation of the response probabilities by local polynomial regression. We study the asymptotic properties of the resulting estimator under quasi-randomization. The practical behavior of the proposed nonresponse adjustment approach is evaluated on NHANES data.

    Release date: 2009-12-23
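
As a sketch of the idea (not the paper's implementation), the response probabilities can be estimated by a degree-0 local polynomial (Nadaraya-Watson) regression of the response indicator on a covariate, and respondents' design weights divided by the estimates. The Gaussian kernel and the fixed bandwidth are illustrative assumptions.

```python
import numpy as np

def kernel_response_propensity(x, responded, x_eval, bandwidth):
    """Local-constant (degree-0 local polynomial) regression of the
    response indicator on a covariate: a nonparametric alternative to
    logistic regression for estimating response probabilities."""
    x = np.asarray(x, float)
    r = np.asarray(responded, float)
    out = np.empty(len(x_eval))
    for k, x0 in enumerate(np.asarray(x_eval, float)):
        kern = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)  # Gaussian kernel
        out[k] = np.sum(kern * r) / np.sum(kern)
    return out

def adjusted_weights(w, x, responded, bandwidth):
    """Propensity weighting: divide respondents' design weights by
    their estimated response probabilities."""
    p_hat = kernel_response_propensity(x, responded, x, bandwidth)
    mask = np.asarray(responded, bool)
    return np.asarray(w, float)[mask] / p_hat[mask]
```

Because no parametric form is imposed, a misspecified logistic curve cannot bias the adjustment, at the cost of having to choose a bandwidth.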

  • Articles and reports: 12-001-X200900211038
    Description:

    We examine how to overcome the overestimation caused by link nonresponse in indirect sampling when using the generalized weight share method (GWSM). Several adjustment methods incorporating link nonresponse into the GWSM have been constructed for situations with and without auxiliary variables. A simulation study on a longitudinal survey is presented using some of the adjustment methods we recommend. The simulation results show that these adjusted GWSMs perform well in reducing both estimation bias and variance, and the reduction in bias is substantial.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211036
    Description:

    Surveys are frequently required to produce estimates for subpopulations, sometimes for a single subpopulation and sometimes for several subpopulations in addition to the total population. When membership of a rare subpopulation (or domain) can be determined from the sampling frame, selecting the required domain sample size is relatively straightforward. In this case the main issue is the extent of oversampling to employ when survey estimates are required for several domains and for the total population. Sampling and oversampling rare domains whose members cannot be identified in advance present a major challenge. A variety of methods has been used in this situation. In addition to large-scale screening, these methods include disproportionate stratified sampling, two-phase sampling, the use of multiple frames, multiplicity sampling, panel surveys, and the use of multi-purpose surveys. This paper illustrates the application of these methods in a range of social surveys.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211056
    Description:

    In this Issue is a column where the Editor briefly presents each paper of the current issue of Survey Methodology. It also sometimes contains information on structural or management changes in the journal.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211040
    Description:

    In this paper a multivariate structural time series model is described that accounts for the panel design of the Dutch Labour Force Survey and is applied to estimate monthly unemployment rates. Compared to the generalized regression estimator, this approach results in a substantial increase of the accuracy due to a reduction of the standard error and the explicit modelling of the bias between the subsequent waves.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211041
    Description:

    Estimation of small area (or domain) compositions may suffer from informative missing data, if the probability of missing varies across the categories of interest as well as the small areas. We develop a double mixed modeling approach that combines a random effects mixed model for the underlying complete data with a random effects mixed model of the differential missing-data mechanism. The effect of sampling design can be incorporated through a quasi-likelihood sampling model. The associated conditional mean squared error of prediction is approximated in terms of a three-part decomposition, corresponding to a naive prediction variance, a positive correction that accounts for the hypothetical parameter estimation uncertainty based on the latent complete data, and another positive correction for the extra variation due to the missing data. We illustrate our approach with an application to the estimation of Municipality household compositions based on the Norwegian register household data, which suffer from informative under-registration of the dwelling identity number.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211045
    Description:

    In analysis of sample survey data, degrees-of-freedom quantities are often used to assess the stability of design-based variance estimators. For example, these degrees-of-freedom values are used in the construction of confidence intervals based on t distribution approximations, and in related t tests. In addition, a small degrees-of-freedom term provides a qualitative indication of the possible limitations of a given variance estimator in a specific application. Degrees-of-freedom calculations are sometimes based on forms of the Satterthwaite approximation. These Satterthwaite-based calculations depend primarily on the relative magnitudes of stratum-level variances. However, for designs involving a small number of primary units selected per stratum, standard stratum-level variance estimators provide limited information on the true stratum variances. For such cases, customary Satterthwaite-based calculations can be problematic, especially in analyses for subpopulations that are concentrated in a relatively small number of strata. To address this problem, this paper uses estimated within-primary-sample-unit (within-PSU) variances to provide auxiliary information regarding the relative magnitudes of the overall stratum-level variances. Analytic results indicate that the resulting degrees-of-freedom estimator will be better than modified Satterthwaite-type estimators provided: (a) the overall stratum-level variances are approximately proportional to the corresponding within-stratum variances; and (b) the variances of the within-PSU variance estimators are relatively small. In addition, this paper develops errors-in-variables methods that can be used to check conditions (a) and (b) empirically. For these model checks, we develop simulation-based reference distributions, which differ substantially from reference distributions based on customary large-sample normal approximations. The proposed methods are applied to four variables from the U.S. Third National Health and Nutrition Examination Survey (NHANES III).

    Release date: 2009-12-23
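
The classical Satterthwaite approximation that these calculations build on has a one-line form when the variance estimator is a sum of independent components. The sketch below is the textbook formula, not code from the paper.

```python
import numpy as np

def satterthwaite_df(variance_components, df_components):
    """Satterthwaite approximation to the degrees of freedom of a
    variance estimator that is a sum of independent components v_h,
    each with df_h degrees of freedom:

        df = (sum_h v_h)^2 / sum_h (v_h^2 / df_h)
    """
    v = np.asarray(variance_components, float)
    d = np.asarray(df_components, float)
    return v.sum() ** 2 / np.sum(v ** 2 / d)
```

With equal components the formula returns the full pooled degrees of freedom; a few dominant strata pull the value down, which is exactly the subpopulation problem the abstract describes.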

  • Technical products: 11-522-X2008000
    Description:

    Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987. Symposium 2008 was the twenty-fourth in Statistics Canada's series of international symposia on methodological issues. Each year the symposium focuses on a particular theme. In 2008 the theme was: "Data Collection: Challenges, Achievements and New Directions".

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010962
    Description:

    The ÉLDEQ initiated a special data gathering project in March 2008 with the collection of biological materials from 1,973 families. During a typical visit, a nurse collects a blood or saliva sample from the selected child, makes a series of measurements (anthropometry, pulse rate and blood pressure) and administers questionnaires. Planned and supervised by the Institut de la Statistique du Québec (ISQ) and the Université de Montréal, the study is being conducted in cooperation with two private firms and a number of hospitals. This article examines the choice of collection methods, the division of effort among the various players, the sequence of communications and contacts with respondents, the tracing of families who are not contacted, and follow-up on the biological samples. Preliminary field results are also presented.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010973
    Description:

    The Canadian Community Health Survey (CCHS) provides timely estimates of health information at the sub-provincial level. We explore two main issues that prevented us from using physical activity data from CCHS cycle 3.1 (2005) as part of the Profile of Women's Health in Manitoba. The CCHS uses the term 'moderate' for physical effort that meets Canadian minimum guidelines, whereas a Manitoba survey of physical activity conversely uses 'moderate' for sub-minimal levels of activity; that survey also interrogates a wider variety of activities to measure respondents' daily energy expenditure. We found the Manitoba survey better suited to our needs and likely a better measure of women's daily physical activity and health.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010987
    Description:

    Over the last few years, there has been substantial progress in web data collection. Today, many statistical offices offer a web alternative in many different types of surveys. It is widely believed that web data collection may raise data quality while lowering data collection costs. Experience has shown that, when offered the web as a second alternative to paper questionnaires, enterprises have been slow to embrace it. On the other hand, experiments have shown that by promoting web over paper it is possible to raise web take-up rates. However, there are still few studies of what happens when the contact strategy is changed radically and the web option is the only one offered in a complex enterprise survey. In 2008, Statistics Sweden adopted a more or less web-only strategy in the survey of industrial production (PRODCOM). The web questionnaire was developed in the generalised tool for web surveys used by Statistics Sweden. The paper presents the web solution and some experiences from the 2008 PRODCOM survey, including process data on response rates and error ratios as well as the results of a cognitive follow-up of the survey. Some important lessons learned are also presented.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010939
    Description:

    A year ago, the Communications and Operations field initiated what is considered to be Statistics Canada's first business architecture activity. This concerted effort focused on collection-related activities and processes, and was conducted over a short period during which over sixty STC senior and middle managers were consulted.

    We will introduce the discipline of business architecture, an approach based on "business blueprints" to interface between an enterprise's needs and its enabling solutions. We will describe the specific approach used to conduct the Statistics Canada Collection Business Architecture, summarize the key lessons learned from this initiative, and provide an update on where we are and where we are heading.

    We will conclude by illustrating how this approach can serve as the genesis and foundation for an overall Statistics Canada business architecture.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010964
    Description:

    Statistics Netherlands (SN) has been using electronic questionnaires for business surveys since the early nineties. Some years ago SN decided to invest in large-scale use of electronic questionnaires. The big yearly production survey of about 80 000 forms, divided over many different economic activity areas, was redesigned using a metadata-driven approach. The resulting system is able to generate non-intelligent personalized PDF forms and intelligent personalized Blaise forms. The Blaise forms are used by a new tool in the Blaise system that respondents can download from the SN web site to run the questionnaire off-line. Essential to the system is the SN house style for paper and electronic forms. The flexibility of the new tool gave the questionnaire designers the possibility to implement a user-friendly form in accordance with this house style.

    Part of the implementation is an audit trail that offers insight into the way respondents operate the questionnaire program. The entered data, including the audit trail, can be transferred to SN via encrypted e-mail or through the internet. The paper gives an outline of the overall system architecture and the role of Blaise in the system. It also describes the results of using the system for several years and some results of the analysis of the audit trail.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010976
    Description:

    Many survey organizations use the response rate as an indicator for the quality of survey data. As a consequence, a variety of measures are implemented to reduce non-response or to maintain response at an acceptable level. However, the response rate is not necessarily a good indicator of non-response bias. A higher response rate does not imply smaller non-response bias. What matters is how the composition of the response differs from the composition of the sample as a whole. This paper describes the concept of R-indicators to assess potential differences between the sample and the response. Such indicators may facilitate analysis of survey response over time, between various fieldwork strategies or data collection modes. Some practical examples are given.

    Release date: 2009-12-03
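
A basic R-indicator reduces the spread of estimated response propensities to a single number. The sketch below assumes the form commonly attributed to Schouten, Cobben and Bethlehem, R = 1 - 2 S(rho), with a design-weighted standard deviation; it is an illustrative sketch, not code from the paper.

```python
import numpy as np

def r_indicator(propensities, weights=None):
    """Representativeness indicator R = 1 - 2 * S(rho), where S(rho)
    is the (design-weighted) standard deviation of estimated response
    propensities. R = 1 indicates fully representative response;
    lower values signal selective nonresponse."""
    rho = np.asarray(propensities, float)
    w = np.ones_like(rho) if weights is None else np.asarray(weights, float)
    mean = np.sum(w * rho) / np.sum(w)
    s = np.sqrt(np.sum(w * (rho - mean) ** 2) / np.sum(w))
    return 1.0 - 2.0 * s
```

Unlike a response rate, the indicator is unaffected by a uniform drop in propensities: it only reacts when propensities vary across sample members.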

  • Technical products: 11-522-X200800011002
    Description:

    Based on a representative sample of the Canadian population, this article quantifies the bias resulting from the use of self-reported rather than directly measured height, weight and body mass index (BMI). Associations between BMI categories and selected health conditions are compared to see if the misclassification resulting from the use of self-reported data alters associations between obesity and obesity-related health conditions. The analysis is based on 4,567 respondents to the 2005 Canadian Community Health Survey (CCHS) who, during a face-to-face interview, provided self-reported values for height and weight and were then measured by trained interviewers. Based on self-reported data, a substantial proportion of individuals with excess body weight were erroneously placed in lower BMI categories. This misclassification resulted in elevated associations between overweight/obesity and morbidity.

    Release date: 2009-12-03
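
BMI itself is a simple ratio, and the misclassification the abstract describes happens at the category boundaries. The helper below is an illustrative sketch using the standard WHO cut-points, not code from the article.

```python
def bmi_category(weight_kg, height_m):
    """Body mass index (kg/m^2) and its standard WHO category, as used
    when comparing self-reported with directly measured values."""
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        cat = "underweight"
    elif bmi < 25:
        cat = "normal weight"
    elif bmi < 30:
        cat = "overweight"
    else:
        cat = "obese"
    return bmi, cat
```

A small under-report of weight or over-report of height near a cut-point is enough to shift a respondent into a lower category, which is the mechanism behind the elevated associations reported above.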

  • Technical products: 11-522-X200800010968
    Description:

    Statistics Canada has embarked on a program of increasing and improving the use of imaging technology for paper survey questionnaires. The goal is to make the process an efficient, reliable and cost-effective method of capturing survey data. The objective is to continue using Optical Character Recognition (OCR) to capture the data from questionnaires, documents and faxes received, while improving the process integration and the Quality Assurance/Quality Control (QA/QC) of the data capture process. These improvements are discussed in this paper.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010959
    Description:

    The Unified Enterprise Survey (UES) at Statistics Canada is an annual business survey that unifies more than 60 surveys from different industries. Two types of collection follow-up score functions are currently used in the UES data collection. The objective of using a score function is to maximize the economically weighted response rates of the survey in terms of the primary variables of interest, under the constraint of a limited follow-up budget. Since the two types of score functions are based on different methodologies, they could have different impacts on the final estimates.

    This study compares the two types of score functions using collection data obtained from the two most recent years. For comparison purposes, the study applies each score function method to the same data and computes various estimates of the published financial and commodity variables, their deviation from the pseudo-true value, and their mean square deviation, based on each method. These estimates of deviation and mean square deviation are then used to measure the impact of each score function on the final estimates of the financial and commodity variables.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010975
    Description:

    A major issue in official statistics is the availability of objective measures supporting fact-based decision making. Istat has developed an information system to assess survey quality. Among other standard quality indicators, nonresponse rates are systematically computed and stored for all surveys. Such a rich information base permits analysis over time and comparisons among surveys. The paper focuses on the analysis of the interrelated effects of data collection mode and other survey characteristics on total nonresponse. Particular attention is devoted to the extent to which multi-mode data collection improves response rates.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010970
    Description:

    RTI International is currently conducting a longitudinal education study. One component of the study involved collecting transcripts and course catalogs from the high schools that sample members attended; the information in them then needed to be keyed and coded. This presented a challenge because the transcripts and course catalogs were collected from different types of schools across the nation, including public, private, and religious schools, and varied widely in both content and format. The challenge called for a sophisticated system that could be used by multiple users simultaneously. RTI developed such a system: a web-based, multi-user, user-friendly, low-maintenance transcript and course catalog keying and coding system with three major functions: transcript and catalog keying and coding, keying quality control (keyer-coder end), and coding QC (management end). Given the complex nature of the work, the system was designed to be flexible: it can transport keyed and coded data throughout the system to reduce keying time, logically guide users through all the pages that a type of activity requires, display appropriate information to support keying performance, and track all keying, coding, and QC activities. Hundreds of catalogs and thousands of transcripts were successfully keyed, coded, and verified using the system. This paper reports on the system needs and design, implementation tips, problems faced and their solutions, and lessons learned.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010967
    Description:

    In this paper the background of the eXtensible Business Reporting Language and the involvement of Statistics Netherlands in the Dutch Taxonomy Project are discussed. The discussion predominantly focuses on the statistical context of using XBRL and the Dutch Taxonomy for expressing data terms to companies.

    Release date: 2009-12-03

Data (0)

Data (0) (0 results)

Your search for "" found no results in this section of the site.

You may try:

Analysis (24)

Analysis (24) (24 of 24 results)

  • Articles and reports: 12-001-X200900211044
    Description:

    In large scaled sample surveys it is common practice to employ stratified multistage designs where units are selected using simple random sampling without replacement at each stage. Variance estimation for these types of designs can be quite cumbersome to implement, particularly for non-linear estimators. Various bootstrap methods for variance estimation have been proposed, but most of these are restricted to single-stage designs or two-stage cluster designs. An extension of the rescaled bootstrap method (Rao and Wu 1988) to stratified multistage designs is proposed which can easily be extended to any number of stages. The proposed method is suitable for a wide range of reweighting techniques, including the general class of calibration estimators. A Monte Carlo simulation study was conducted to examine the performance of the proposed multistage rescaled bootstrap variance estimator.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211042
    Description:

    This paper proposes an approach for small area prediction based on data obtained from periodic surveys and censuses. We apply our approach to obtain population predictions for the municipalities not sampled in the Brazilian annual Household Survey (PNAD), as well as to increase the precision of the design-based estimates obtained for the sampled municipalities. In addition to the data provided by the PNAD, we use census demographic data from 1991 and 2000, as well as a complete population count conducted in 1996. Hierarchically non-structured and spatially structured growth models that gain strength from all the sampled municipalities are proposed and compared.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211046
    Description:

    A semiparametric regression model is developed for complex surveys. In this model, the explanatory variables are represented separately as a nonparametric part and a parametric linear part. The estimation techniques combine nonparametric local polynomial regression estimation and least squares estimation. Asymptotic results such as consistency and normality of the estimators of regression coefficients and the regression functions have also been developed. Success of the performance of the methods and the properties of estimates have been shown by simulation and empirical examples with the Ontario Health Survey 1990.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211037
    Description:

    Randomized response strategies, which have originally been developed as statistical methods to reduce nonresponse as well as untruthful answering, can also be applied in the field of statistical disclosure control for public use microdata files. In this paper a standardization of randomized response techniques for the estimation of proportions of identifying or sensitive attributes is presented. The statistical properties of the standardized estimator are derived for general probability sampling. In order to analyse the effect of different choices of the method's implicit "design parameters" on the performance of the estimator we have to include measures of privacy protection in our considerations. These yield variance-optimum design parameters given a certain level of privacy protection. To this end the variables have to be classified into different categories of sensitivity. A real-data example applies the technique in a survey on academic cheating behaviour.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211043
    Description:

    Business surveys often use a one-stage stratified simple random sampling without replacement design with some certainty strata. Although weight adjustment is typically applied for unit nonresponse, the variability due to nonresponse may be omitted in practice when estimating variances. This is problematic especially when there are certainty strata. We derive some variance estimators that are consistent when the number of sampled units in each weighting cell is large, using the jackknife, linearization, and modified jackknife methods. The derived variance estimators are first applied to empirical data from the Annual Capital Expenditures Survey conducted by the U.S. Census Bureau and are then examined in a simulation study.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211039
    Description:

    Propensity weighting is a procedure to adjust for unit nonresponse in surveys. A form of implementing this procedure consists of dividing the sampling weights by estimates of the probabilities that the sampled units respond to the survey. Typically, these estimates are obtained by fitting parametric models, such as logistic regression. The resulting adjusted estimators may become biased when the specified parametric models are incorrect. To avoid misspecifying such a model, we consider nonparametric estimation of the response probabilities by local polynomial regression. We study the asymptotic properties of the resulting estimator under quasi-randomization. The practical behavior of the proposed nonresponse adjustment approach is evaluated on NHANES data.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211038
    Description:

    We examine overcoming the overestimation in using generalized weight share method (GWSM) caused by link nonresponse in indirect sampling. A few adjustment methods incorporating link nonresponse in using GWSM have been constructed for situations both with and without the availability of auxiliary variables. A simulation study on a longitudinal survey is presented using some of the adjustment methods we recommend. The simulation results show that these adjusted GWSMs perform well in reducing both estimation bias and variance. The advancement in bias reduction is significant.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211036
    Description:

    Surveys are frequently required to produce estimates for subpopulations, sometimes for a single subpopulation and sometimes for several subpopulations in addition to the total population. When membership of a rare subpopulation (or domain) can be determined from the sampling frame, selecting the required domain sample size is relatively straightforward. In this case the main issue is the extent of oversampling to employ when survey estimates are required for several domains and for the total population. Sampling and oversampling rare domains whose members cannot be identified in advance present a major challenge. A variety of methods has been used in this situation. In addition to large-scale screening, these methods include disproportionate stratified sampling, two-phase sampling, the use of multiple frames, multiplicity sampling, panel surveys, and the use of multi-purpose surveys. This paper illustrates the application of these methods in a range of social surveys.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211056
    Description:

    In this Issue is a column in which the Editor briefly presents each paper of the current issue of Survey Methodology. It sometimes also contains information on structural or management changes in the journal.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211040
    Description:

    In this paper a multivariate structural time series model is described that accounts for the panel design of the Dutch Labour Force Survey and is applied to estimate monthly unemployment rates. Compared to the generalized regression estimator, this approach results in a substantial increase in accuracy, due to a reduction of the standard error and the explicit modelling of the bias between subsequent waves.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211041
    Description:

    Estimation of small area (or domain) compositions may suffer from informative missing data, if the probability of missing varies across the categories of interest as well as the small areas. We develop a double mixed modeling approach that combines a random effects mixed model for the underlying complete data with a random effects mixed model of the differential missing-data mechanism. The effect of sampling design can be incorporated through a quasi-likelihood sampling model. The associated conditional mean squared error of prediction is approximated in terms of a three-part decomposition, corresponding to a naive prediction variance, a positive correction that accounts for the hypothetical parameter estimation uncertainty based on the latent complete data, and another positive correction for the extra variation due to the missing data. We illustrate our approach with an application to the estimation of Municipality household compositions based on the Norwegian register household data, which suffer from informative under-registration of the dwelling identity number.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211045
    Description:

    In analysis of sample survey data, degrees-of-freedom quantities are often used to assess the stability of design-based variance estimators. For example, these degrees-of-freedom values are used in the construction of confidence intervals based on t distribution approximations and of related t tests. In addition, a small degrees-of-freedom term provides a qualitative indication of the possible limitations of a given variance estimator in a specific application. Degrees-of-freedom calculations sometimes are based on forms of the Satterthwaite approximation. These Satterthwaite-based calculations depend primarily on the relative magnitudes of stratum-level variances. However, for designs involving a small number of primary units selected per stratum, standard stratum-level variance estimators provide limited information on the true stratum variances. For such cases, customary Satterthwaite-based calculations can be problematic, especially in analyses for subpopulations that are concentrated in a relatively small number of strata. To address this problem, this paper uses estimated within-primary-sample-unit (within PSU) variances to provide auxiliary information regarding the relative magnitudes of the overall stratum-level variances. Analytic results indicate that the resulting degrees-of-freedom estimator will be better than modified Satterthwaite-type estimators provided: (a) the overall stratum-level variances are approximately proportional to the corresponding within-stratum variances; and (b) the variances of the within-PSU variance estimators are relatively small. In addition, this paper develops errors-in-variables methods that can be used to check conditions (a) and (b) empirically. For these model checks, we develop simulation-based reference distributions, which differ substantially from reference distributions based on customary large-sample normal approximations. The proposed methods are applied to four variables from the U.S. Third National Health and Nutrition Examination Survey (NHANES III).
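
    To see why Satterthwaite-based degrees of freedom depend on the relative magnitudes of the stratum-level variances, here is a minimal sketch of the customary approximation; the stratum variance contributions and PSU counts are illustrative.

```python
# Customary Satterthwaite degrees-of-freedom approximation for a
# stratified variance estimator: with stratum variance contributions v_h
# estimated from n_h sampled PSUs,
#   df = (sum_h v_h)^2 / sum_h v_h^2 / (n_h - 1).
def satterthwaite_df(v, n_psu):
    num = sum(v) ** 2
    den = sum(vh ** 2 / (nh - 1) for vh, nh in zip(v, n_psu))
    return num / den

# Equal stratum variances with 2 PSUs per stratum in 10 strata give
# df = 10, the familiar "number of PSUs minus number of strata".
print(satterthwaite_df([1.0] * 10, [2] * 10))  # 10.0

# Concentrating variance in a few strata drives the df down, which is
# exactly the subpopulation problem the paper addresses.
print(satterthwaite_df([5.0, 5.0] + [0.1] * 8, [2] * 10))
```
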

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900110880
    Description:

    This paper provides a framework for estimation by calibration in two-phase sampling designs. This work grew out of the continuing development of generalized estimation software at Statistics Canada. An important objective in this development is to provide a wide range of options for effective use of auxiliary information in different sampling designs. This objective is reflected in the general methodology for two-phase designs presented in this paper.

    We consider the traditional two-phase sampling design. A phase one sample is drawn from the finite population, and then a phase two sample is drawn as a subsample of the first. The study variable, whose unknown population total is to be estimated, is observed only for the units in the phase two sample. Arbitrary sampling designs are allowed in each phase of sampling. Different types of auxiliary information are identified for the computation of the calibration weights at each phase. The auxiliary variables and the study variables can be continuous or categorical.

    The paper contributes to four important areas in the general context of calibration for two-phase designs: (1) Three broad types of auxiliary information for two-phase designs are identified and used in the estimation. The information is incorporated into the weights in two steps: a phase one calibration and a phase two calibration. We discuss the composition of the appropriate auxiliary vectors for each step, and use a linearization method to arrive at the residuals that determine the asymptotic variance of the calibration estimator. (2) We examine the effect of alternative choices of starting weights for the calibration. The two "natural" choices for the starting weights generally produce slightly different estimators. However, under certain conditions, these two estimators have the same asymptotic variance. (3) We re-examine variance estimation for the two-phase calibration estimator. A new procedure is proposed that can improve significantly on the usual technique of conditioning on the phase one sample. A simulation in section 10 serves to validate the advantage of this new method. (4) We compare the calibration approach with the traditional model-assisted regression technique which uses a linear regression fit at two levels. We show that the model-assisted estimator has properties similar to a two-phase calibration estimator.
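
    The basic calibration step that gets repeated at each phase can be illustrated in its simplest single-auxiliary form, the ratio-estimator special case of linear calibration: rescale the starting weights so the weighted total of the auxiliary variable hits its known total. The numbers below are illustrative, not from the paper.

```python
# Minimal single-variable linear calibration (the ratio-estimator
# special case): scale the starting weights so the weighted total of the
# auxiliary variable x matches its known population total.
def calibrate(start_weights, x, x_total):
    ht_total = sum(d * xi for d, xi in zip(start_weights, x))
    g = x_total / ht_total            # common calibration factor
    return [d * g for d in start_weights]

d = [5.0, 5.0, 5.0]                   # starting (design) weights
x = [2.0, 4.0, 6.0]                   # auxiliary variable on the sample
w = calibrate(d, x, x_total=72.0)     # known population total of x
print(sum(wi * xi for wi, xi in zip(w, x)))  # 72.0 -- calibration holds
```

    In the two-phase setting a step like this is applied twice, with a different auxiliary vector and starting weights at each phase.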

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110883
    Description:

    We use a Bayesian method to resolve the boundary solution problem of the maximum likelihood (ML) estimate in an incomplete two-way contingency table, using a loglinear model and Dirichlet priors. We compare five Dirichlet priors for estimating multinomial cell probabilities under nonignorable nonresponse. Three of these priors have been used for an incomplete one-way table, while the other two are newly proposed to reflect the difference in response patterns between respondents and the undecided. Unlike in previous studies, the Bayesian estimates with the first three priors do not always perform better than the ML estimates, whereas the two new priors perform better than both the first three priors and the ML estimates whenever a boundary solution occurs. We use four sets of data from the 1998 Ohio state polls to illustrate how to use and interpret estimation results for the elections, and simulation studies to compare the performance of the five Bayesian estimates under nonignorable nonresponse.
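
    The boundary-solution problem and the Dirichlet remedy are easiest to see in the simple one-way complete-data case (the paper's loglinear two-way setting with nonresponse is more elaborate). Under a Dirichlet(alpha) prior the posterior mean of each multinomial cell probability is (count + alpha) / (n + sum of alphas), so a zero cell count no longer forces a zero (boundary) estimate. The counts and the uniform Dirichlet(1,1,1) prior below are illustrative.

```python
# Posterior mean of multinomial cell probabilities under a
# Dirichlet(alpha) prior: (count_j + alpha_j) / (n + sum(alpha)).
# A positive alpha keeps the estimate off the boundary even when a
# cell count is zero, unlike the ML estimate count_j / n.
def dirichlet_posterior_mean(counts, alpha):
    n = sum(counts)
    a = sum(alpha)
    return [(c + al) / (n + a) for c, al in zip(counts, alpha)]

post = dirichlet_posterior_mean([8, 2, 0], [1.0, 1.0, 1.0])
# The empty third cell gets 1/13 rather than the ML estimate of 0.
print([round(p, 4) for p in post])
```
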

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110886
    Description:

    Interviewer variability is a major component of variability of survey statistics. Different strategies related to question formatting, question phrasing, interviewer training, interviewer workload, interviewer experience and interviewer assignment are employed in an effort to reduce interviewer variability. The traditional formula for measuring interviewer variability, commonly referred to as the interviewer effect, is ieff := deff_int = 1 + (n̄_int − 1) ρ_int, where ρ_int and n̄_int are the intra-interviewer correlation and the simple average of the interviewer workloads, respectively. In this article, we provide a model-assisted justification of this well-known formula for equal probability of selection methods (epsem) with no spatial clustering in the sample and equal interviewer workload. However, spatial clustering and unequal weighting are both very common in large scale surveys. In the context of a complex sampling design, we obtain an appropriate formula for the interviewer variability that takes into consideration unequal probability of selection and spatial clustering. Our formula provides a more accurate assessment of interviewer effects and is thus helpful in allocating a more reasonable amount of funds to control interviewer variability. We also propose a decomposition of the overall effect into effects due to weighting, spatial clustering and interviewers. Such a decomposition is helpful in understanding ways to reduce total variance by different means.
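
    The traditional formula quoted above is simple enough to compute directly; the workload and correlation values below are illustrative.

```python
# Traditional interviewer design effect under epsem, equal workloads
# and no spatial clustering:
#   ieff = 1 + (mean workload - 1) * intra-interviewer correlation.
def interviewer_effect(mean_workload, rho_int):
    return 1.0 + (mean_workload - 1.0) * rho_int

# Even a small intra-interviewer correlation matters once workloads are
# large: 50 interviews per interviewer with rho = 0.02 nearly doubles
# the variance of the estimate.
print(interviewer_effect(50, 0.02))  # 1.98
```
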

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110884
    Description:

    The paper considers small domain estimation of the proportion of persons without health insurance for different minority groups. The small domains are cross-classified by age, sex and other demographic characteristics. Both hierarchical and empirical Bayes estimation methods are used. Also, second order accurate approximations of the mean squared errors of the empirical Bayes estimators and bias-corrected estimators of these mean squared errors are provided. The general methodology is illustrated with estimates of the proportion of uninsured persons for several cross-sections of the Asian subpopulation.
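
    The flavour of empirical Bayes estimation for small domains can be conveyed by a generic shrinkage sketch in the Fay-Herriot spirit (not the paper's exact hierarchical model): the domain estimate pulls the direct survey estimate toward a synthetic model-based estimate, shrinking more when the direct estimate is noisier. All values below are hypothetical.

```python
# Generic empirical-Bayes shrinkage: a weighted combination of the
# direct estimate and a synthetic (model) estimate, with weight
#   gamma = model_var / (model_var + sampling_var)
# on the direct estimate, so noisy domains are shrunk harder.
def eb_estimate(direct, synthetic, sampling_var, model_var):
    gamma = model_var / (model_var + sampling_var)
    return gamma * direct + (1 - gamma) * synthetic

# A small domain with a very noisy direct estimate of the uninsured
# proportion is pulled strongly toward the synthetic estimate.
print(round(eb_estimate(direct=0.30, synthetic=0.18,
                        sampling_var=0.09, model_var=0.01), 6))  # 0.192
```
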

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110887
    Description:

    Many survey organisations focus on the response rate as being the quality indicator for the impact of non-response bias. As a consequence, they implement a variety of measures to reduce non-response or to maintain response at some acceptable level. However, response rates alone are not good indicators of non-response bias. In general, higher response rates do not imply smaller non-response bias. The literature gives many examples of this (e.g., Groves and Peytcheva 2006, Keeter, Miller, Kohut, Groves and Presser 2000, Schouten 2004).

    We introduce a number of concepts and an indicator to assess the similarity between the response and the sample of a survey. Such quality indicators, which we call R-indicators, may serve as counterparts to survey response rates and are primarily directed at evaluating the non-response bias. These indicators may facilitate analysis of survey response over time, between various fieldwork strategies or data collection modes. We apply the R-indicators to two practical examples.
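
    One proposed R-indicator takes the form R = 1 − 2·S(ρ̂), where S(ρ̂) is the standard deviation of the estimated response propensities over the sample: identical propensities give R = 1 (response composition matches the sample), and greater propensity variation pushes R toward 0. This is a minimal unweighted sketch with illustrative propensities.

```python
# Sketch of an R-indicator R = 1 - 2*S(rho): S is the (population)
# standard deviation of estimated response propensities. Values near 1
# indicate a response whose composition resembles the full sample.
import math

def r_indicator(propensities):
    n = len(propensities)
    mean = sum(propensities) / n
    var = sum((p - mean) ** 2 for p in propensities) / n
    return 1.0 - 2.0 * math.sqrt(var)

uniform = [0.6, 0.6, 0.6, 0.6]   # everyone equally likely to respond
skewed = [0.9, 0.9, 0.3, 0.3]    # propensity varies across the sample
print(round(r_indicator(uniform), 6))  # 1.0
print(round(r_indicator(skewed), 6))   # 0.4
```

    Note that both examples could have the same overall response rate of 0.6, which is exactly the point: the response rate alone does not distinguish them.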

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110881
    Description:

    Regression diagnostics are geared toward identifying individual points or groups of points that have an important influence on a fitted model. When fitting a model with survey data, the sources of influence are the response variable Y, the predictor variables X, and the survey weights, W. This article discusses the use of the hat matrix and leverages to identify points that may be influential in fitting linear models due to large weights or values of predictors. We also contrast findings that an analyst will obtain if ordinary least squares is used rather than survey weighted least squares to determine which points are influential.
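
    One common form of the survey-weighted hat matrix is H = X(XᵀWX)⁻¹XᵀW, whose diagonal entries (the leverages) flag points that are influential through extreme predictor values or large weights. The sketch below writes out the 2×2 inverse by hand for an intercept-plus-slope model to stay dependency-free; the data are illustrative.

```python
# Leverages from the survey-weighted hat matrix H = X (X'WX)^{-1} X'W.
# For X = [1, x] the 2x2 normal matrix is inverted explicitly:
#   h_i = w_i * (s2 - 2*s1*x_i + s0*x_i^2) / det,
# where s0 = sum w, s1 = sum w*x, s2 = sum w*x^2, det = s0*s2 - s1^2.
def weighted_leverages(x, w):
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = s0 * s2 - s1 * s1
    return [wi * (s2 - 2 * s1 * xi + s0 * xi * xi) / det
            for wi, xi in zip(w, x)]

x = [1.0, 2.0, 3.0, 4.0, 10.0]   # last point has an extreme predictor
w = [1.0, 1.0, 1.0, 1.0, 8.0]    # ... and a large survey weight
h = weighted_leverages(x, w)
print(round(h[-1], 3))   # that point's leverage is close to 1
print(round(sum(h), 3))  # leverages sum to p = 2, as in unweighted OLS
```

    Running the same data through ordinary least squares (all weights equal) would give the last point a much smaller leverage, which is the contrast the article examines.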

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110892
    Description:

    In this Issue is a column in which the Editor briefly presents each paper of the current issue of Survey Methodology. It sometimes also contains information on structural or management changes in the journal.

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110885
    Description:

    Peaks in the spectrum of a stationary process are indicative of the presence of stochastic periodic phenomena, such as a stochastic seasonal effect. This work proposes to measure and test for the presence of such spectral peaks via assessing their aggregate slope and convexity. Our method is developed nonparametrically, and thus may be useful during a preliminary analysis of a series. The technique is also useful for detecting the presence of residual seasonality in seasonally adjusted data. The diagnostic is investigated through simulation and an extensive case study using data from the U.S. Census Bureau and the Organization for Economic Co-operation and Development (OECD).
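
    The raw ingredient behind such diagnostics is the periodogram: a peak at frequency k/n signals a periodic component with period n/k. The bare-bones DFT sketch below uses a deterministic series of period 4 over n = 16 points purely for illustration; the paper's diagnostic goes further by testing the slope and convexity around such peaks.

```python
# Bare-bones periodogram via the DFT:
#   I(k) = |sum_t y_t exp(-2*pi*i*k*t/n)|^2 / n,  k = 0..n/2.
# A periodic component of period n/k produces a peak at index k.
import cmath

def periodogram(y):
    n = len(y)
    return [abs(sum(y[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]

y = [1.0, 0.0, -1.0, 0.0] * 4   # cycle of period 4 over n = 16 points
I = periodogram(y)
peak = max(range(1, len(I)), key=lambda k: I[k])
print(peak)  # 4, i.e., period 16/4 = 4
```
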

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110888
    Description:

    In the selection of a sample, a current practice is to define a sampling design stratified on subpopulations. This reduces the variance of the Horvitz-Thompson estimator in comparison with direct sampling if the strata are highly homogeneous with respect to the variable of interest. If auxiliary variables are available for each individual, sampling can be improved through balanced sampling within each stratum, and the Horvitz-Thompson estimator will be more precise if the auxiliary variables are strongly correlated with the variable of interest. However, if the sample allocation is small in some strata, balanced sampling will be only very approximate. In this paper, we propose a method of selecting a sample that is balanced across the entire population while maintaining a fixed allocation within each stratum. We show that in the important special case of size-2 sampling in each stratum, the precision of the Horvitz-Thompson estimator is improved if the variable of interest is well explained by balancing variables over the entire population. An application to rotational sampling is also presented.

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110882
    Description:

    The bootstrap technique is becoming more and more popular in sample surveys conducted by national statistical agencies. In most of its implementations, several sets of bootstrap weights accompany the survey microdata file given to analysts. So far, the use of the technique in practice seems to have been mostly limited to variance estimation problems. In this paper, we propose a bootstrap methodology for testing hypotheses about a vector of unknown model parameters when the sample has been drawn from a finite population. The probability sampling design used to select the sample may be informative or not. Our method uses model-based test statistics that incorporate the survey weights. Such statistics are usually easily obtained using classical software packages. We approximate the distribution under the null hypothesis of these weighted model-based statistics by using bootstrap weights. An advantage of our bootstrap method over existing methods of hypothesis testing with survey data is that, once sets of bootstrap weights are provided to analysts, it is very easy to apply even when no specialized software dealing with complex surveys is available. Also, our simulation results suggest that, overall, it performs similarly to the Rao-Scott procedure and better than the Wald and Bonferroni procedures when testing hypotheses about a vector of linear regression model parameters.
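
    The mechanics are simple once the bootstrap weight sets are in hand: recompute the weighted statistic under each set, and use the spread of the bootstrap statistics around the full-sample estimate as the null distribution. The sketch below tests a weighted mean rather than the regression parameters of the paper, and the data and bootstrap weight sets (which an agency would normally supply with the microdata file) are illustrative.

```python
# Sketch of hypothesis testing with bootstrap weights: the deviations of
# the bootstrap statistics from the full-sample estimate approximate the
# null distribution of the test statistic.
def weighted_mean(y, w):
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def bootstrap_p_value(y, w, boot_weights, theta0):
    theta_hat = weighted_mean(y, w)
    observed = abs(theta_hat - theta0)
    boot_stats = [abs(weighted_mean(y, wb) - theta_hat)
                  for wb in boot_weights]
    return sum(b >= observed for b in boot_stats) / len(boot_stats)

y = [1.0, 2.0, 3.0, 4.0]
w = [1.0, 1.0, 1.0, 1.0]          # full-sample weights
boot = [[2, 0, 1, 1], [0, 2, 1, 1],
        [1, 1, 2, 0], [1, 1, 0, 2]]  # toy bootstrap weight sets
p = bootstrap_p_value(y, w, boot, theta0=3.0)
print(p)  # 0.0 -- every bootstrap stat is closer to the estimate
```

    With only four replicates this is a toy; in practice hundreds of bootstrap weight sets are used, and the statistic is whatever the analyst's classical software produces with the weights plugged in.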

    Release date: 2009-06-22

  • Articles and reports: 82-003-X200900110795
    Description:

    This article presents methods of combining cycles of the Canadian Community Health Survey and discusses issues to consider if these data are to be combined.

    Release date: 2009-02-18

  • Articles and reports: 91F0015M2008010
    Description:

    The objective of this study is to examine the feasibility of using provincial and territorial health care files of new registrants as an independent measure of preliminary inter-provincial and inter-territorial migration. The study aims at measuring the conceptual and quantifiable differences between this data source and our present source of the Canada Revenue Agency's Canadian Child Tax Benefit.

    Criteria were established to assess the quality and appropriateness of these provincial/territorial health care records as a proxy for our migration estimates: coverage, consistency, timeliness, reliability, level of detail, uniformity and accuracy.

    Based on the present analysis, the paper finds that these data do not improve the estimates and would not be suitable at this time as a measure of inter-provincial/territorial migration. These Medicare data are nevertheless an important independent data source that can be used for quality evaluation.

    Release date: 2009-01-13

Reference (94)


  • Technical products: 11-522-X2008000
    Description:

    Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987. Symposium 2008 was the twenty-fourth in Statistics Canada's series of international symposia on methodological issues. Each year the symposium focuses on a particular theme. In 2008 the theme was: "Data Collection: Challenges, Achievements and New Directions".

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010962
    Description:

    The ÉLDEQ initiated a special data gathering project in March 2008 with the collection of biological materials from 1,973 families. During a typical visit, a nurse collects a blood or saliva sample from the selected child, makes a series of measurements (anthropometry, pulse rate and blood pressure) and administers questionnaires. Planned and supervised by the Institut de la Statistique du Québec (ISQ) and the Université de Montréal, the study is being conducted in cooperation with two private firms and a number of hospitals. This article examines the choice of collection methods, the division of effort among the various players, the sequence of communications and contacts with respondents, the tracing of families who are not contacted, and follow-up on the biological samples. Preliminary field results are also presented.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010973
    Description:

    The Canadian Community Health Survey (CCHS) provides timely estimates of health information at the sub-provincial level. We explore two main issues that prevented us from using physical activity data from CCHS cycle 3.1 (2005) as part of the Profile of Women's Health in Manitoba. CCHS uses the term 'moderate' to describe physical effort that meets Canadian minimum guidelines, whereas 'moderate' conversely describes sub-minimal levels of activity elsewhere. A Manitoba survey of physical activity covers a wider variety of activities to measure respondents' daily energy expenditure. We found the latter survey better suited to our needs and likely a better measure of women's daily physical activity and health.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010987
    Description:

    Over the last few years, there has been considerable progress in the area of web data collection. Today, many statistical offices offer a web alternative in many different types of surveys. It is widely believed that web data collection may raise data quality while lowering data collection costs. Experience has shown that, when offered the web as a second alternative to paper questionnaires, enterprises have been slow to embrace it. On the other hand, experiments have also shown that by promoting web over paper it is possible to raise web take-up rates. However, there are still few studies of what happens when the contact strategy is changed radically and the web option is the only option given in a complex enterprise survey. In 2008, Statistics Sweden took the step of using a more or less web-only strategy in the survey of industrial production (PRODCOM). The web questionnaire was developed in the generalised tool for web surveys used by Statistics Sweden. The paper presents the web solution and some experiences from the 2008 PRODCOM survey, including process data on response rates and error ratios as well as the results of a cognitive follow-up of the survey. Some important lessons learned are also presented.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010939
    Description:

    A year ago, the Communications and Operations field initiated what is considered to be Statistics Canada's first business architecture activity. This concerted effort was focused on collection-related activities and processes, and was conducted over a short period during which over sixty STC senior and middle managers were consulted.

    We will introduce the discipline of business architecture, an approach based on "business blueprints" to interface between enterprise needs and its enabling solutions. We will describe the specific approach used to conduct Statistics Canada Collection Business Architecture, summarize the key lessons learned from this initiative, and provide an update on where we are and where we are heading.

    We will conclude by illustrating how this approach can serve as the genesis and foundation for an overall Statistics Canada business architecture.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010964
    Description:

    Statistics Netherlands (SN) has been using electronic questionnaires for business surveys since the early nineties. Some years ago, SN decided to invest in large-scale use of electronic questionnaires. The big yearly production survey of about 80 000 forms, divided over many different economic activity areas, was redesigned using a metadata-driven approach. The resulting system is able to generate non-intelligent personalized PDF forms and intelligent personalized Blaise forms. The Blaise forms are used by a new tool in the Blaise system which can be downloaded by respondents from the SN web site to run the questionnaire off-line. Essential to the system is the SN house style for paper and electronic forms. The flexibility of the new tool offered the questionnaire designers the possibility to implement a user-friendly form according to this house style.

    Part of the implementation is an audit trail that offers insight in the way respondents operate the questionnaire program. The entered data including the audit trail can be transferred via encrypted e-mail or through the internet to SN. The paper will give an outline of the overall system architecture and the role of Blaise in the system. It will also describe the results of using the system for several years now and some results of the analysis of the audit trail.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010976
    Description:

    Many survey organizations use the response rate as an indicator for the quality of survey data. As a consequence, a variety of measures are implemented to reduce non-response or to maintain response at an acceptable level. However, the response rate is not necessarily a good indicator of non-response bias. A higher response rate does not imply smaller non-response bias. What matters is how the composition of the response differs from the composition of the sample as a whole. This paper describes the concept of R-indicators to assess potential differences between the sample and the response. Such indicators may facilitate analysis of survey response over time, between various fieldwork strategies or data collection modes. Some practical examples are given.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011002
    Description:

    Based on a representative sample of the Canadian population, this article quantifies the bias resulting from the use of self-reported rather than directly measured height, weight and body mass index (BMI). Associations between BMI categories and selected health conditions are compared to see if the misclassification resulting from the use of self-reported data alters associations between obesity and obesity-related health conditions. The analysis is based on 4,567 respondents to the 2005 Canadian Community Health Survey (CCHS) who, during a face-to-face interview, provided self-reported values for height and weight and were then measured by trained interviewers. Based on self-reported data, a substantial proportion of individuals with excess body weight were erroneously placed in lower BMI categories. This misclassification resulted in elevated associations between overweight/obesity and morbidity.
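
    The misclassification mechanism described above is easy to illustrate: BMI is weight divided by height squared, and the typical self-report pattern of under-reported weight and over-reported height pushes the index down across a category boundary. The standard WHO cut-points are used below; the individual values are hypothetical, not from the CCHS file.

```python
# BMI = weight (kg) / height (m)^2, with the standard category
# cut-points, and an illustration of how modest self-report errors
# shift a respondent into a lower category.
def bmi(weight_kg, height_m):
    return weight_kg / height_m ** 2

def category(b):
    if b < 18.5:
        return "underweight"
    if b < 25.0:
        return "normal"
    if b < 30.0:
        return "overweight"
    return "obese"

# Measured: 77 kg at 1.68 m; self-reported: 72 kg at 1.72 m.
print(category(bmi(77.0, 1.68)))  # overweight (BMI about 27.3)
print(category(bmi(72.0, 1.72)))  # normal (BMI about 24.3)
```
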

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010968
    Description:

    Statistics Canada has embarked on a program of increasing and improving the usage of imaging technology for paper survey questionnaires. The goal is to make the process an efficient, reliable and cost-effective method of capturing survey data. The objective is to continue using Optical Character Recognition (OCR) to capture the data from questionnaires, documents and faxes received, while improving the process integration and quality assurance/quality control (QA/QC) of the data capture process. These improvements are discussed in this paper.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010959
    Description:

    The Unified Enterprise Survey (UES) at Statistics Canada is an annual business survey that unifies more than 60 surveys from different industries. Two types of collection follow-up score functions are currently used in the UES data collection. The objective of using a score function is to maximize the economically weighted response rates of the survey in terms of the primary variables of interest, under the constraint of a limited follow-up budget. Since the two types of score functions are based on different methodologies, they could have different impacts on the final estimates.

    This study compares the two types of score functions based on the collection data obtained from the two most recent years. For comparison purposes, this study applies each score function method to the same data and computes various estimates of the published financial and commodity variables, their deviation from the pseudo-true value, and their mean square deviation under each method. These estimates of deviation and mean square deviation are then used to measure the impact of each score function on the final estimates of the financial and commodity variables.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010975
    Description:

    A major issue in official statistics is the availability of objective measures supporting fact-based decision making. Istat has developed an information system to assess survey quality. Among other standard quality indicators, nonresponse rates are systematically computed and stored for all surveys. Such a rich information base permits analysis over time and comparisons among surveys. The paper focuses on the analysis of the interrelationships between data collection mode and other survey characteristics on total nonresponse. Particular attention is devoted to the extent to which multi-mode data collection improves response rates.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010970
    Description:

    RTI International is currently conducting a longitudinal education study. One component of the study involved collecting transcripts and course catalogs from the high schools that sample members attended. Information from the transcripts and course catalogs also needed to be keyed and coded. This presented a challenge because the transcripts and course catalogs were collected from different types of schools, including public, private, and religious schools from across the nation, and they varied widely in both content and format. The challenge called for a sophisticated system that could be used by multiple users simultaneously. RTI developed such a system: a web-based, multi-user, multitask, user-friendly and low-maintenance application for keying and coding high school transcripts and course catalogs. The system has three major functions: transcript and catalog keying and coding, keying quality control (keyer-coder end), and coding QC (management end). Given the complex nature of transcript and catalog keying and coding, the system was designed to be flexible, with the ability to transport keyed and coded data throughout the system to reduce keying time, to logically guide users through all the pages that a given activity requires, to display appropriate information to help keying performance, and to track all keying, coding, and QC activities. Hundreds of catalogs and thousands of transcripts were successfully keyed, coded, and verified using the system. This paper reports on the system needs and design, implementation tips, problems faced and their solutions, and lessons learned.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010967
    Description:

    In this paper the background of the eXtensible Business Reporting Language and the involvement of Statistics Netherlands in the Dutch Taxonomy Project are discussed. The discussion predominantly focuses on the statistical context of using XBRL and the Dutch Taxonomy for expressing data terms to companies.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010961
    Description:

    Increasingly, children of all ages are becoming respondents in survey interviews. While juveniles are considered to be reliable respondents for many topics and survey settings, it is unclear to what extent younger children provide reliable information in a face-to-face interview. In this paper we report results from a study using video captures of 205 face-to-face interviews with children aged 8 through 14. The interviews were coded using behavior codes at the question-by-question level, which provides behavior-related indicators regarding the question-answer process. In addition, standard tests of cognitive resources were conducted. Using visible and audible problems in respondent behavior, we are able to assess the impact of the children's cognitive resources on respondent behaviors. Results suggest that girls and boys differ fundamentally in the cognitive mechanisms leading to problematic respondent behaviors.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010992
    Description:

    The Canadian Community Health Survey (CCHS) was redesigned in 2007 so that it could use the continuous data collection method. Since then, a new sample has been selected every two months, and the data have also been collected over a two-month period. The survey uses two collection techniques: computer-assisted personal interviewing (CAPI) for the sample drawn from an area frame, and computer-assisted telephone interviewing (CATI) for the sample selected from a telephone list frame. Statistics Canada has recently implemented some data collection initiatives to reduce the response burden and survey costs while maintaining or improving data quality. The new measures include the use of a call management tool in the CATI system and a limit on the number of calls. They help manage telephone calls and limit the number of attempts made to contact a respondent. In addition, with the paradata that became available very recently, reports are now being generated to assist in evaluating and monitoring collection procedures and efficiency in real time. The CCHS has also been selected to implement further collection initiatives in the future. This paper provides a brief description of the survey, explains the advantages of continuous collection and outlines the impact that the new initiatives have had on the survey.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011006
    Description:

    The Office for National Statistics (ONS) has an obligation to measure and annually report on the burden that it places on businesses participating in its surveys. There are also targets for reduction of costs to businesses complying with government regulation as part of the 2005 Administrative Burdens Reduction Project (ABRP) coordinated by the Better Regulation Executive (BRE).

    Respondent burden is measured by looking at the economic costs to businesses. Over time the methodology for measuring this economic cost has changed with the most recent method being the development and piloting of a Standard Cost Model (SCM) approach.

    The SCM is commonly used in Europe and is focused on measuring objective administrative burdens for all government requests for information, e.g. tax returns and VAT, as well as survey participation. This method was therefore not specifically developed to measure statistical response burden. The SCM methodology is activity-based, meaning that the costs and time taken to fulfil requirements are broken down by activity.

    The SCM approach generally collects data using face-to-face interviews. The approach is therefore labour-intensive from both a collection and an analysis perspective, but provides in-depth information. The approach developed and piloted at ONS uses paper self-completion questionnaires.

    The objective of this paper is to provide an overview of respondent burden reporting and targets, and to review the different methodologies that ONS has used to measure respondent burden from the perspectives of sampling, data collection, analysis and usability.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010947
    Description:

    This paper addresses the efforts of the U.S. Energy Information Administration to design, test and implement new and substantially redesigned surveys. The need to change EIA's surveys has become increasingly important as U.S. energy industries have moved from highly regulated to deregulated businesses. This has substantially affected both their ability and willingness to report data. The paper focuses on how EIA has deployed current tools for designing and testing surveys and the reasons that these methods have not always yielded the desired results. It suggests some new tools and methods that we would like to try to improve the quality of our data.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011004
    Description:

    The issue of reducing response burden is not new. Statistics Sweden works in different ways to reduce response burden and to decrease the administrative costs of data collection from enterprises and organizations. Under its legislation, Statistics Sweden must reduce the response burden on the business community, so this work is a priority. The Government has set a fixed target of decreasing the administrative costs of enterprises by twenty-five percent by 2010. This goal also applies to data collection for statistical purposes, and it concerns surveys for which response is compulsory by law. Beyond these surveys there are many more, and a need to measure and reduce their burden as well. To help measure, analyze and reduce the burden, Statistics Sweden has developed the Register of Data providers concerning enterprises and organizations (ULR). The purpose of the register is twofold: to measure and analyze the burden at an aggregated level, and to be able to tell each individual enterprise which surveys it is participating in.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010974
    Description:

    This paper will focus on establishment survey questionnaire design guidelines. More specifically, it will discuss the process involved in transitioning a set of guidelines written for a broad, survey methodological audience to a more narrow, agency-specific audience of survey managers and analysts. The process involved the work of a team comprised of individuals from across the Census Bureau's Economic Directorate, working in a cooperative and collaborative manner. The team decided what needed to be added, modified, and deleted from the broad starting point, and determined how much of the theory and experimental evidence found in the literature was necessary to include in the guidelines. In addition to discussing the process, the paper will also describe the end result: a set of questionnaire design guidelines for the Economic Directorate.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010999
    Description:

    The choice of number of call attempts in a telephone survey is an important decision. A large number of call attempts makes the data collection costly and time-consuming; and a small number of attempts decreases the response set from which conclusions are drawn and increases the variance. The decision can also have an effect on the nonresponse bias. In this paper we study the effects of number of call attempts on the nonresponse rate and the nonresponse bias in two surveys conducted by Statistics Sweden: The Labour Force Survey (LFS) and Household Finances (HF).

    By use of paradata we calculate the response rate as a function of the number of call attempts. To estimate the nonresponse bias we use estimates of some register variables, where observations are available for both respondents and nonrespondents. We also calculate estimates of some real survey parameters as functions of varying number of call attempts. The results indicate that it is possible to reduce the current number of call attempts without getting an increased nonresponse bias.
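The calculation described above can be sketched with made-up paradata: each sample unit records the call attempt on which a response was obtained (None if it never responded) and a register variable observed for respondents and nonrespondents alike, so the respondent mean can be compared against the known full-sample mean at each cutoff. All figures are illustrative, not LFS or HF data.

```python
# Hypothetical paradata: (attempt on which the unit responded, register
# variable such as income). None = unit never responded.
paradata = [
    (1, 310), (2, 250), (None, 400), (3, 280),
    (1, 330), (None, 390), (5, 260), (2, 300),
]

# Full-sample mean of the register variable, known for everyone.
full_mean = sum(x for _, x in paradata) / len(paradata)

def response_rate_and_bias(max_attempts):
    """Response rate and nonresponse-bias estimate for the register
    variable if calling stops after max_attempts attempts."""
    respondents = [x for a, x in paradata
                   if a is not None and a <= max_attempts]
    rate = len(respondents) / len(paradata)
    bias = sum(respondents) / len(respondents) - full_mean
    return rate, bias

for k in range(1, 6):
    rate, bias = response_rate_and_bias(k)
    print(f"{k} attempts: response rate {rate:.2f}, bias {bias:+.1f}")
```

Plotting the rate and bias against the cutoff in this way is what lets one judge whether the last few call attempts actually reduce bias or merely add cost.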

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010950
    Description:

    The next census will be conducted in May 2011. Being a major survey, it presents a formidable challenge for Statistics Canada and requires a great deal of time and resources. Careful planning has been done to ensure that all deadlines are met. A number of steps have been planned in the questionnaire testing process. These tests apply to both census content and the proposed communications strategy. This paper presents an overview of the strategy, with a focus on combining qualitative studies with the 2008 quantitative study so that the results can be analyzed and the proposals properly evaluated.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011008
    Description:

    In one sense, a questionnaire is never complete. Test results, paradata and research findings constantly provide reasons to update and improve the questionnaire. In addition, establishments change over time and questions need to be updated accordingly. In reality, it does not always work like this. At Statistics Sweden there are several examples of questionnaires that were designed at one point in time and rarely improved later on. However, we are currently trying to shift the perspective on questionnaire design from a linear to a cyclic one. We are developing a cyclic model in which the questionnaire can be improved continuously in multiple rounds. In this presentation, we will discuss this model and how we work with it.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010988
    Description:

    Online data collection emerged in 1995 as an alternative approach for conducting certain types of consumer research studies, and its use had grown substantially by 2008. This growth has been primarily in studies where non-probability sampling methods are used. While online sampling has gained acceptance for some research applications, serious questions remain concerning online samples' suitability for research requiring precise volumetric measurement of the behavior of the U.S. population, particularly their travel behavior. This paper reviews the literature and compares results from studies using probability samples and online samples to understand whether results differ between the two sampling approaches. The paper also demonstrates that online samples underestimate critical types of travel even after demographic and geographic weighting.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010948
    Description:

    Past survey instruments, whether in the form of a paper questionnaire or telephone script, were their own documentation. Based on this, the ESRC Question Bank was created, providing free-access internet publication of questionnaires, enabling researchers to re-use questions, saving them trouble, whilst improving the comparability of their data with that collected by others. Today however, as survey technology and computer programs have become more sophisticated, accurate comprehension of the latest questionnaires seems more difficult, particularly when each survey team uses its own conventions to document complex items in technical reports. This paper seeks to illustrate these problems and suggest preliminary standards of presentation to be used until the process can be automated.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010972
    Description:

    Background: Evaluation of the coverage that results from linking routinely collected administrative hospital data with survey data is an important preliminary step to undertaking analyses based on the linked file.

    Data and methods: To evaluate the coverage of the linkage between data from cycle 1.1 of the Canadian Community Health Survey (CCHS) and in-patient hospital data (Health Person-Oriented Information or HPOI), the number of people admitted to hospital according to HPOI was compared with the weighted estimate for CCHS respondents who were successfully linked to HPOI. Differences between HPOI and the linked and weighted CCHS estimate indicated linkage failure and/or undercoverage.

    Results: According to HPOI, from September 2000 through November 2001, 1,572,343 people (outside Quebec) aged 12 or older were hospitalized. Weighted estimates from the linked CCHS, adjusted for agreement to link and plausible health number, were 7.7% lower. Coverage rates were similar for males and females. Provincial rates did not differ from those for the rest of Canada, although differences were apparent for the territories. Coverage rates were significantly lower among people aged 75 or older than among those aged 12 to 74.
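The coverage comparison reported above amounts to taking the weighted linked survey estimate as a fraction of the administrative count. The HPOI total and the 7.7% shortfall come from the abstract; the linked estimate is derived from them here purely for illustration.

```python
# Coverage of the CCHS-HPOI linkage, sketched from the figures reported
# in the abstract. The linked estimate below is back-calculated from the
# stated 7.7% shortfall, for illustration only.
hpoi_admissions = 1_572_343   # people hospitalized per HPOI (from abstract)
shortfall = 0.077             # linked, weighted CCHS estimate was 7.7% lower

linked_cchs_estimate = hpoi_admissions * (1 - shortfall)
coverage_rate = linked_cchs_estimate / hpoi_admissions

print(f"Linked CCHS estimate: {linked_cchs_estimate:,.0f}")
print(f"Coverage rate: {coverage_rate:.1%}")
```

A coverage rate of roughly 92% against the administrative count is the kind of benchmark the abstract uses to flag linkage failure and/or undercoverage before analysis of the linked file.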

    Release date: 2009-12-03
