Statistics by subject – Statistical methods

All (118) (25 of 118 results)

  • Articles and reports: 12-001-X200900211041
    Description:

    Estimation of small area (or domain) compositions may suffer from informative missing data, if the probability of being missing varies across the categories of interest as well as across the small areas. We develop a double mixed modeling approach that combines a random effects mixed model for the underlying complete data with a random effects mixed model of the differential missing-data mechanism. The effect of sampling design can be incorporated through a quasi-likelihood sampling model. The associated conditional mean squared error of prediction is approximated in terms of a three-part decomposition, corresponding to a naive prediction variance, a positive correction that accounts for the hypothetical parameter estimation uncertainty based on the latent complete data, and another positive correction for the extra variation due to the missing data. We illustrate our approach with an application to the estimation of municipality household compositions based on Norwegian register household data, which suffer from informative under-registration of the dwelling identity number.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211044
    Description:

    In large-scale sample surveys it is common practice to employ stratified multistage designs where units are selected using simple random sampling without replacement at each stage. Variance estimation for these types of designs can be quite cumbersome to implement, particularly for non-linear estimators. Various bootstrap methods for variance estimation have been proposed, but most of these are restricted to single-stage designs or two-stage cluster designs. An extension of the rescaled bootstrap method (Rao and Wu 1988) to stratified multistage designs is proposed which can easily be extended to any number of stages. The proposed method is suitable for a wide range of reweighting techniques, including the general class of calibration estimators. A Monte Carlo simulation study was conducted to examine the performance of the proposed multistage rescaled bootstrap variance estimator. (A minimal sketch of the rescaled bootstrap appears after this list.)

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211043
    Description:

    Business surveys often use a one-stage stratified simple random sampling without replacement design with some certainty strata. Although weight adjustment is typically applied for unit nonresponse, the variability due to nonresponse may be omitted in practice when estimating variances. This is problematic especially when there are certainty strata. We derive some variance estimators that are consistent when the number of sampled units in each weighting cell is large, using the jackknife, linearization, and modified jackknife methods. The derived variance estimators are first applied to empirical data from the Annual Capital Expenditures Survey conducted by the U.S. Census Bureau and are then examined in a simulation study.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211036
    Description:

    Surveys are frequently required to produce estimates for subpopulations, sometimes for a single subpopulation and sometimes for several subpopulations in addition to the total population. When membership of a rare subpopulation (or domain) can be determined from the sampling frame, selecting the required domain sample size is relatively straightforward. In this case the main issue is the extent of oversampling to employ when survey estimates are required for several domains and for the total population. Sampling and oversampling rare domains whose members cannot be identified in advance present a major challenge. A variety of methods has been used in this situation. In addition to large-scale screening, these methods include disproportionate stratified sampling, two-phase sampling, the use of multiple frames, multiplicity sampling, panel surveys, and the use of multi-purpose surveys. This paper illustrates the application of these methods in a range of social surveys.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211045
    Description:

    In analysis of sample survey data, degrees-of-freedom quantities are often used to assess the stability of design-based variance estimators. For example, these degrees-of-freedom values are used in the construction of confidence intervals based on t-distribution approximations, and of related t-tests. In addition, a small degrees-of-freedom term provides a qualitative indication of the possible limitations of a given variance estimator in a specific application. Degrees-of-freedom calculations sometimes are based on forms of the Satterthwaite approximation. These Satterthwaite-based calculations depend primarily on the relative magnitudes of stratum-level variances. However, for designs involving a small number of primary units selected per stratum, standard stratum-level variance estimators provide limited information on the true stratum variances. For such cases, customary Satterthwaite-based calculations can be problematic, especially in analyses for subpopulations that are concentrated in a relatively small number of strata. To address this problem, this paper uses estimated within-primary-sample-unit (within-PSU) variances to provide auxiliary information regarding the relative magnitudes of the overall stratum-level variances. Analytic results indicate that the resulting degrees-of-freedom estimator will be better than modified Satterthwaite-type estimators provided: (a) the overall stratum-level variances are approximately proportional to the corresponding within-stratum variances; and (b) the variances of the within-PSU variance estimators are relatively small. In addition, this paper develops errors-in-variables methods that can be used to check conditions (a) and (b) empirically. For these model checks, we develop simulation-based reference distributions, which differ substantially from reference distributions based on customary large-sample normal approximations. The proposed methods are applied to four variables from the U.S. Third National Health and Nutrition Examination Survey (NHANES III).

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211040
    Description:

    In this paper a multivariate structural time series model is described that accounts for the panel design of the Dutch Labour Force Survey and is applied to estimate monthly unemployment rates. Compared to the generalized regression estimator, this approach results in a substantial increase in accuracy due to a reduction in the standard error and the explicit modelling of the bias between subsequent waves.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211037
    Description:

    Randomized response strategies, which were originally developed as statistical methods to reduce nonresponse as well as untruthful answering, can also be applied in the field of statistical disclosure control for public use microdata files. In this paper a standardization of randomized response techniques for the estimation of proportions of identifying or sensitive attributes is presented. The statistical properties of the standardized estimator are derived for general probability sampling. In order to analyse the effect of different choices of the method's implicit "design parameters" on the performance of the estimator we have to include measures of privacy protection in our considerations. These yield variance-optimum design parameters given a certain level of privacy protection. To this end the variables have to be classified into different categories of sensitivity. A real-data example applies the technique in a survey on academic cheating behaviour.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211042
    Description:

    This paper proposes an approach for small area prediction based on data obtained from periodic surveys and censuses. We apply our approach to obtain population predictions for the municipalities not sampled in the Brazilian annual Household Survey (PNAD), as well as to increase the precision of the design-based estimates obtained for the sampled municipalities. In addition to the data provided by the PNAD, we use census demographic data from 1991 and 2000, as well as a complete population count conducted in 1996. Hierarchically non-structured and spatially structured growth models that gain strength from all the sampled municipalities are proposed and compared.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211038
    Description:

    We examine how to overcome the overestimation caused by link nonresponse in indirect sampling when the generalized weight share method (GWSM) is used. Several adjustment methods that incorporate link nonresponse into the GWSM have been constructed, for situations both with and without auxiliary variables. A simulation study based on a longitudinal survey is presented, using some of the adjustment methods we recommend. The simulation results show that the adjusted GWSMs perform well in reducing both estimation bias and variance, and the improvement in bias reduction is significant.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211039
    Description:

    Propensity weighting is a procedure to adjust for unit nonresponse in surveys. One way to implement this procedure is to divide the sampling weights by estimates of the probabilities that the sampled units respond to the survey. Typically, these estimates are obtained by fitting parametric models, such as logistic regression. The resulting adjusted estimators may become biased when the specified parametric models are incorrect. To avoid misspecifying such a model, we consider nonparametric estimation of the response probabilities by local polynomial regression. We study the asymptotic properties of the resulting estimator under quasi-randomization. The practical behavior of the proposed nonresponse adjustment approach is evaluated on NHANES data. (A sketch of the local-polynomial propensity adjustment appears after this list.)

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211056
    Description:

    In this Issue is a column where the Editor briefly presents each paper of the current issue of Survey Methodology. It also sometimes contains information on structure or management changes in the journal.

    Release date: 2009-12-23

  • Articles and reports: 12-001-X200900211046
    Description:

    A semiparametric regression model is developed for complex surveys. In this model, the explanatory variables are represented separately as a nonparametric part and a parametric linear part. The estimation techniques combine nonparametric local polynomial regression estimation and least squares estimation. Asymptotic results, such as consistency and normality of the estimators of the regression coefficients and the regression functions, are also developed. The performance of the methods and the properties of the estimates are demonstrated through simulations and empirical examples based on the 1990 Ontario Health Survey.

    Release date: 2009-12-23

  • Technical products: 11-522-X2008000
    Description:

    Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987. Symposium 2008 was the twenty-fourth in Statistics Canada's series of international symposia on methodological issues. Each year the symposium focuses on a particular theme. In 2008 the theme was: "Data Collection: Challenges, Achievements and New Directions".

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010972
    Description:

    Background: Evaluation of the coverage that results from linking routinely collected administrative hospital data with survey data is an important preliminary step to undertaking analyses based on the linked file. Data and methods: To evaluate the coverage of the linkage between data from cycle 1.1 of the Canadian Community Health Survey (CCHS) and in-patient hospital data (Health Person-Oriented Information or HPOI), the number of people admitted to hospital according to HPOI was compared with the weighted estimate for CCHS respondents who were successfully linked to HPOI. Differences between HPOI and the linked and weighted CCHS estimate indicated linkage failure and/or undercoverage. Results: According to HPOI, from September 2000 through November 2001, 1,572,343 people (outside Quebec) aged 12 or older were hospitalized. Weighted estimates from the linked CCHS, adjusted for agreement to link and plausible health number, were 7.7% lower. Coverage rates were similar for males and females. Provincial rates did not differ from those for the rest of Canada, although differences were apparent for the territories. Coverage rates were significantly lower among people aged 75 or older than among those aged 12 to 74.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010983
    Description:

    The US Census Bureau conducts monthly, quarterly, and annual surveys of the American economy and a census every 5 years. These programs require significant business effort. New technologies, new forms of organization, and scarce resources affect the ability of businesses to respond. Changes also affect what businesses expect from the Census Bureau, the Census Bureau's internal systems, and the way businesses interact with the Census Bureau.

    For several years, the Census Bureau has maintained a special relationship with large companies to help them prepare for the census. We have also worked toward company-centric communication across all programs. A relationship model has emerged that focuses on infrastructure and business practices, and allows the Census Bureau to be more responsive.

    This paper focuses on the Census Bureau's company-centric communications and systems. We describe important initiatives and challenges, and we review their impact on Census Bureau practices and respondent behavior.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010955
    Description:

    Survey managers are still discovering the usefulness of digital audio recording for monitoring and managing field staff. Its value so far has been for confirming the authenticity of interviews, detecting curbstoning, offering a concrete basis for feedback on interviewing performance and giving data collection managers an intimate view of in-person interviews. In addition, computer audio-recorded interviewing (CARI) can improve other aspects of survey data quality, offering corroboration or correction of response coding by field staff. Audio recordings may replace or supplement in-field verbatim transcription of free responses, and speech-to-text technology might make this technique more efficient in the future.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010949
    Description:

    The expansion in scope of UK equality legislation has led to a requirement for data on sexual orientation. In response, the Office for National Statistics has initiated a project aiming to provide advice on best practice with regard to data collection in this field, and to examine the feasibility of providing data that will satisfy user needs. The project contains qualitative and quantitative research methodologies in relation to question development and survey operational issues. This includes:

      • A review of UK and international surveys already collecting data on sexual orientation/identity
      • A series of focus groups exploring conceptual issues surrounding "sexual identity", including related terms and the acceptability of questioning on multi-purpose household surveys
      • A series of quantitative trials with particular attention to item non-response, question administration and data collection
      • Cognitive testing to ensure questioning was interpreted as intended
      • Quantitative research on potential bias issues in relation to proxy responses

    Future analysis and reporting issues are being considered alongside question development, e.g., accurately capturing statistics on populations with low prevalence.

    The presentation also discusses the practical survey administration issues relating to ensuring privacy in a concurrent interview situation, both face to face and over the telephone.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010979
    Description:

    Prior to 2006, the Canadian Census of Population relied on field staff to deliver questionnaires to all dwellings in Canada. For the 2006 Census, an address frame was created to cover almost 70% of dwellings in Canada, and these questionnaires were delivered by Canada Post. For the 2011 Census, Statistics Canada aims to expand this frame further, with a target of delivering questionnaires by mail to between 80% and 85% of dwellings. Mailing questionnaires for the Census raises a number of issues, among them: ensuring returned questionnaires are counted in the right area, creating an up-to-date address frame that includes all new growth, and determining which areas are unsuitable for having questionnaires delivered by mail. Changes to the address frame update procedures for 2011, most notably the decision to use purely administrative data as the frame wherever possible and to conduct field update exercises only where deemed necessary, provide a new set of challenges for the 2011 Census.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010993
    Description:

    Until now, years of experience in questionnaire design were required to estimate how long it would take a respondent, on average, to complete a CATI questionnaire for a new survey. This presentation focuses on a new method that produces interview time estimates for questionnaires at the development stage. The method uses Blaise Audit Trail data and previous surveys. It was developed, tested and verified for accuracy on some large-scale surveys.

    First, audit trail data was used to determine the average time previous respondents have taken to answer specific types of questions. These would include questions that require a yes/no answer, scaled questions, "mark all that apply" questions, etc. Second, for any given questionnaire, the paths taken by population sub-groups were mapped to identify the series of questions answered by different types of respondents, and timed to determine what the longest possible interview time would be. Finally, the overall expected time it takes to complete the questionnaire is calculated using estimated proportions of the population expected to answer each question.

    So far, we have used paradata to accurately estimate average respondent interview completion times. The method we developed could also be used to estimate completion times for specific respondents. (A simplified worked example of the expected-time calculation appears after this list.)

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010958
    Description:

    Telephone Data Entry (TDE) is a system by which survey respondents can return their data to the Office for National Statistics (ONS) using the keypad on their telephone; it currently accounts for approximately 12% of total responses to ONS business surveys. ONS is currently increasing the number of surveys which use TDE as the primary mode of response, and this paper gives an overview of the redevelopment project, covering the redevelopment of the paper questionnaire, enhancements made to the TDE system and the results from piloting these changes. Improvements in the quality of the data received and increased response via TDE as a result of these developments suggest that data quality improvements and cost savings are possible when TDE is promoted as the primary mode of response for short-term surveys.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010941
    Description:

    Prior to 2004, the design and development of collection functions at Statistics New Zealand (Statistics NZ) was done by a centralised team of data collection methodologists. In 2004, an organisational review considered whether the design and development of these functions was being done in the most effective way. A key issue was the rising costs of surveying as the organisation moved from paper-based data collection to electronic data collection. The review saw some collection functions decentralised. However, a smaller centralised team of data collection methodologists was retained to work with subject matter areas across Statistics NZ.

    This paper will discuss the strategy used by the smaller centralised team of data collection methodologists to support subject matter areas. There are three key themes to the strategy. The first is the development of best practice standards and a central standards repository. The second is training and the introduction of knowledge-sharing forums. The third is providing advice and independent review to subject matter areas that design and develop collection instruments.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010940
    Description:

    Data Collection Methodology (DCM) enables the collection of good-quality data by providing expert advice and assistance on questionnaire design, methods of evaluation and respondent engagement. DCM assists in the development of client skills, undertakes research and leads innovation in data collection methods. This is done in a challenging environment of organisational change and limited resources. This paper covers 'how DCM does business' with clients and the wider methodological community to achieve our goals.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010920
    Description:

    On behalf of Statistics Canada, I would like to welcome you all, friends and colleagues, to Symposium 2008. This is the 24th International Symposium organized by Statistics Canada on survey methodology.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010948
    Description:

    Past survey instruments, whether in the form of a paper questionnaire or telephone script, were their own documentation. Based on this, the ESRC Question Bank was created, providing free-access internet publication of questionnaires, enabling researchers to re-use questions, saving them trouble, whilst improving the comparability of their data with that collected by others. Today however, as survey technology and computer programs have become more sophisticated, accurate comprehension of the latest questionnaires seems more difficult, particularly when each survey team uses its own conventions to document complex items in technical reports. This paper seeks to illustrate these problems and suggest preliminary standards of presentation to be used until the process can be automated.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010937
    Description:

    The context of the discussion is the increasing incidence of international surveys, of which one is the International Tobacco Control (ITC) Policy Evaluation Project, which began in 2002. The ITC country surveys are longitudinal, and their aim is to evaluate the effects of policy measures being introduced in various countries under the WHO Framework Convention on Tobacco Control. The challenges of organization, data collection and analysis in international surveys are reviewed and illustrated. Analysis is an increasingly important part of the motivation for large scale cross-cultural surveys. The fundamental challenge for analysis is to discern the real response (or lack of response) to policy change, separating it from the effects of data collection mode, differential non-response, external events, time-in-sample, culture, and language. Two problems relevant to statistical analysis are discussed. The first problem is the question of when and how to analyze pooled data from several countries, in order to strengthen conclusions which might be generally valid. While in some cases this seems to be straightforward, there are differing opinions on the extent to which pooling is possible and reasonable. It is suggested that for formal comparisons, random effects models are of conceptual use. The second problem is to find models of measurement across cultures and data collection modes which will enable calibration of continuous, binary and ordinal responses, and produce comparisons from which extraneous effects have been removed. It is noted that hierarchical models provide a natural way of relaxing requirements of model invariance across groups.

    Release date: 2009-12-03
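
As a minimal illustration of the rescaled bootstrap entry above (12-001-X200900211044): the sketch below implements the common single-stage form of the Rao-Wu rescaled bootstrap, in which each replicate draws n_h - 1 units with replacement within stratum h and rescales the design weights. It is a hypothetical sketch, not the paper's multistage extension; all function names and data are invented.

```python
import numpy as np

rng = np.random.default_rng(42)

def rescaled_bootstrap_weights(strata, weights, n_boot=500):
    """Rao-Wu rescaled bootstrap weights for a single-stage stratified design.

    In each stratum with n_h sampled units, draw n_h - 1 units with
    replacement; the replicate weight of unit i is
        w_i* = w_i * (n_h / (n_h - 1)) * m_i,
    where m_i is the number of times unit i was drawn.
    """
    strata = np.asarray(strata)
    weights = np.asarray(weights, dtype=float)
    boot = np.empty((n_boot, len(weights)))
    for b in range(n_boot):
        w_star = np.zeros_like(weights)
        for h in np.unique(strata):
            idx = np.flatnonzero(strata == h)
            n_h = len(idx)
            draws = rng.choice(idx, size=n_h - 1, replace=True)
            m = np.bincount(draws, minlength=len(weights))[idx]
            w_star[idx] = weights[idx] * (n_h / (n_h - 1)) * m
        boot[b] = w_star
    return boot

# Bootstrap variance of a weighted total (invented data).
strata = np.repeat([1, 2, 3], 10)
weights = np.full(30, 50.0)
y = rng.normal(100, 20, size=30)

bw = rescaled_bootstrap_weights(strata, weights)
totals = bw @ y  # the weighted total under each replicate
print("estimate:", weights @ y, "bootstrap variance:", totals.var(ddof=1))
```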
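
As a minimal illustration of the propensity-weighting entry above (12-001-X200900211039): the sketch below estimates response propensities by a local-linear (degree-1 local polynomial) regression of the response indicator on one covariate, then divides respondents' design weights by the estimates. The Gaussian kernel, bandwidth and data are our own assumptions, not the authors' specification.

```python
import numpy as np

def local_linear_propensity(x, r, x_eval, bandwidth):
    """Local-linear regression of the response indicator r on covariate x.

    For each evaluation point, fit a kernel-weighted least-squares line
    and return its intercept: a nonparametric estimate of P(respond | x).
    """
    x, r = np.asarray(x, float), np.asarray(r, float)
    phat = np.empty(len(x_eval))
    for j, x0 in enumerate(np.asarray(x_eval, float)):
        u = (x - x0) / bandwidth
        k = np.exp(-0.5 * u**2)  # Gaussian kernel weights
        X = np.column_stack([np.ones_like(x), x - x0])
        beta = np.linalg.lstsq(X * np.sqrt(k)[:, None],
                               r * np.sqrt(k), rcond=None)[0]
        phat[j] = beta[0]  # fitted value at x0 is the intercept
    return np.clip(phat, 0.05, 1.0)  # guard against tiny propensities

rng = np.random.default_rng(1)
x = rng.uniform(20, 80, 400)       # e.g., age of sampled persons
p_true = 0.4 + 0.006 * x           # response propensity rises with x
r = rng.random(400) < p_true       # response indicator
w = np.full(400, 25.0)             # design weights

phat = local_linear_propensity(x, r, x, bandwidth=5.0)
w_adj = np.where(r, w / phat, 0.0)  # respondents reweighted by 1/phat
print(w.sum(), w_adj.sum())         # adjusted weights roughly restore the total
```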
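
As a worked illustration of the expected-time calculation in the CATI entry above (11-522-X200800010993): the expected completion time is the sum over questions of the average answer time (e.g., derived from audit-trail data) times the proportion of respondents expected to reach that question. The timings and proportions below are invented.

```python
# Hypothetical per-question average answer times (seconds) and the
# estimated proportion of respondents expected to answer each question.
questions = [
    # (label, avg_seconds, proportion_answering)
    ("yes/no screener",       8.0, 1.00),
    ("scaled satisfaction",  15.0, 0.85),
    ("mark-all-that-apply",  25.0, 0.85),
    ("follow-up open-ended", 60.0, 0.30),
]

expected = sum(t * p for _, t, p in questions)
longest = sum(t for _, t, _ in questions)  # path where every question is asked

print(f"expected interview time: {expected:.0f} s")  # 8 + 12.75 + 21.25 + 18 = 60 s
print(f"longest possible path:   {longest:.0f} s")   # 108 s
```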

Data (0) (0 results)

Analysis (24) (24 of 24 results)

  • Articles and reports: 12-001-X200900110884
    Description:

    The paper considers small domain estimation of the proportion of persons without health insurance for different minority groups. The small domains are cross-classified by age, sex and other demographic characteristics. Both hierarchical and empirical Bayes estimation methods are used. Also, second order accurate approximations of the mean squared errors of the empirical Bayes estimators and bias-corrected estimators of these mean squared errors are provided. The general methodology is illustrated with estimates of the proportion of uninsured persons for several cross-sections of the Asian subpopulation.

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110882
    Description:

    The bootstrap technique is becoming more and more popular in sample surveys conducted by national statistical agencies. In most of its implementations, several sets of bootstrap weights accompany the survey microdata file given to analysts. So far, the use of the technique in practice seems to have been mostly limited to variance estimation problems. In this paper, we propose a bootstrap methodology for testing hypotheses about a vector of unknown model parameters when the sample has been drawn from a finite population. The probability sampling design used to select the sample may be informative or not. Our method uses model-based test statistics that incorporate the survey weights. Such statistics are usually easily obtained using classical software packages. We approximate the distribution under the null hypothesis of these weighted model-based statistics by using bootstrap weights. An advantage of our bootstrap method over existing methods of hypothesis testing with survey data is that, once sets of bootstrap weights are provided to analysts, it is very easy to apply even when no specialized software dealing with complex surveys is available. Also, our simulation results suggest that, overall, it performs similarly to the Rao-Scott procedure and better than the Wald and Bonferroni procedures when testing hypotheses about a vector of linear regression model parameters.

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110880
    Description:

    This paper provides a framework for estimation by calibration in two-phase sampling designs. This work grew out of the continuing development of generalized estimation software at Statistics Canada. An important objective in this development is to provide a wide range of options for effective use of auxiliary information in different sampling designs. This objective is reflected in the general methodology for two-phase designs presented in this paper.

    We consider the traditional two-phase sampling design. A phase-one sample is drawn from the finite population, and then a phase-two sample is drawn as a subsample of the first. The study variable, whose unknown population total is to be estimated, is observed only for the units in the phase-two sample. Arbitrary sampling designs are allowed in each phase of sampling. Different types of auxiliary information are identified for the computation of the calibration weights at each phase. The auxiliary variables and the study variables can be continuous or categorical.

    The paper contributes to four important areas in the general context of calibration for two-phase designs:

    (1) Three broad types of auxiliary information for two-phase designs are identified and used in the estimation. The information is incorporated into the weights in two steps: a phase-one calibration and a phase-two calibration. We discuss the composition of the appropriate auxiliary vectors for each step, and use a linearization method to arrive at the residuals that determine the asymptotic variance of the calibration estimator.

    (2) We examine the effect of alternative choices of starting weights for the calibration. The two "natural" choices for the starting weights generally produce slightly different estimators. However, under certain conditions, these two estimators have the same asymptotic variance.

    (3) We re-examine variance estimation for the two-phase calibration estimator. A new procedure is proposed that can improve significantly on the usual technique of conditioning on the phase-one sample. A simulation in Section 10 serves to validate the advantage of this new method.

    (4) We compare the calibration approach with the traditional model-assisted regression technique which uses a linear regression fit at two levels. We show that the model-assisted estimator has properties similar to a two-phase calibration estimator.

    (A minimal single-phase calibration sketch appears after this list.)

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110888
    Description:

    In the selection of a sample, a current practice is to define a sampling design stratified on subpopulations. This reduces the variance of the Horvitz-Thompson estimator in comparison with direct sampling if the strata are highly homogeneous with respect to the variable of interest. If auxiliary variables are available for each individual, sampling can be improved through balanced sampling within each stratum, and the Horvitz-Thompson estimator will be more precise if the auxiliary variables are strongly correlated with the variable of interest. However, if the sample allocation is small in some strata, balanced sampling will be only very approximate. In this paper, we propose a method of selecting a sample that is balanced across the entire population while maintaining a fixed allocation within each stratum. We show that in the important special case of size-2 sampling in each stratum, the precision of the Horvitz-Thompson estimator is improved if the variable of interest is well explained by balancing variables over the entire population. An application to rotational sampling is also presented.

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110881
    Description:

    Regression diagnostics are geared toward identifying individual points or groups of points that have an important influence on a fitted model. When fitting a model with survey data, the sources of influence are the response variable Y, the predictor variables X, and the survey weights, W. This article discusses the use of the hat matrix and leverages to identify points that may be influential in fitting linear models due to large weights or values of predictors. We also contrast findings that an analyst will obtain if ordinary least squares is used rather than survey weighted least squares to determine which points are influential.

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110887
    Description:

    Many survey organisations focus on the response rate as being the quality indicator for the impact of non-response bias. As a consequence, they implement a variety of measures to reduce non-response or to maintain response at some acceptable level. However, response rates alone are not good indicators of non-response bias. In general, higher response rates do not imply smaller non-response bias. The literature gives many examples of this (e.g., Groves and Peytcheva 2006, Keeter, Miller, Kohut, Groves and Presser 2000, Schouten 2004).

    We introduce a number of concepts and an indicator to assess the similarity between the response and the sample of a survey. Such quality indicators, which we call R-indicators, may serve as counterparts to survey response rates and are primarily directed at evaluating the non-response bias. These indicators may facilitate analysis of survey response over time, across fieldwork strategies or data collection modes. We apply the R-indicators to two practical examples. (A small numerical sketch appears after this list.)

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110885
    Description:

    Peaks in the spectrum of a stationary process are indicative of the presence of stochastic periodic phenomena, such as a stochastic seasonal effect. This work proposes to measure and test for the presence of such spectral peaks via assessing their aggregate slope and convexity. Our method is developed nonparametrically, and thus may be useful during a preliminary analysis of a series. The technique is also useful for detecting the presence of residual seasonality in seasonally adjusted data. The diagnostic is investigated through simulation and an extensive case study using data from the U.S. Census Bureau and the Organization for Economic Co-operation and Development (OECD).

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110886
    Description:

    Interviewer variability is a major component of the variability of survey statistics. Different strategies related to question formatting, question phrasing, interviewer training, interviewer workload, interviewer experience and interviewer assignment are employed in an effort to reduce interviewer variability. The traditional formula for measuring interviewer variability, commonly referred to as the interviewer effect, is $\mathit{ieff} := \mathit{deff}_{\mathrm{int}} = 1 + (\bar{n}_{\mathrm{int}} - 1)\,\rho_{\mathrm{int}}$, where $\rho_{\mathrm{int}}$ and $\bar{n}_{\mathrm{int}}$ are the intra-interviewer correlation and the simple average of the interviewer workloads, respectively. In this article, we provide a model-assisted justification of this well-known formula for equal-probability-of-selection methods (epsem) with no spatial clustering in the sample and equal interviewer workloads. However, spatial clustering and unequal weighting are both very common in large-scale surveys. In the context of a complex sampling design, we obtain an appropriate formula for the interviewer variability that takes into consideration unequal probability of selection and spatial clustering. Our formula provides a more accurate assessment of interviewer effects and is thus helpful in allocating a more reasonable amount of funds to control interviewer variability. We also propose a decomposition of the overall effect into effects due to weighting, spatial clustering and interviewers. Such a decomposition is helpful in understanding ways to reduce the total variance by different means. (A small numerical illustration of the traditional formula appears after this list.)

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110883
    Description:

    We use a Bayesian method to resolve the boundary solution problem of the maximum likelihood (ML) estimate in an incomplete two-way contingency table, using a loglinear model and Dirichlet priors. We compare five Dirichlet priors in estimating multinomial cell probabilities under nonignorable nonresponse. Three of these priors have been used for an incomplete one-way table, while the remaining two are newly proposed to reflect the difference in response patterns between respondents and the undecided. Unlike in previous studies, the Bayesian estimates with the three previous priors do not always perform better than the ML estimates, whereas the two new priors perform better than both the previous three priors and the ML estimates whenever a boundary solution occurs. We use four sets of data from the 1998 Ohio state polls to illustrate how to use and interpret estimation results for the elections. We use simulation studies to compare the performance of the five Bayesian estimates under nonignorable nonresponse.

    Release date: 2009-06-22

  • Articles and reports: 12-001-X200900110892
    Description:

    In this Issue is a column where the Editor briefly presents each paper of the current issue of Survey Methodology. It also sometimes contains information on structure or management changes in the journal.

    Release date: 2009-06-22

  • Articles and reports: 82-003-X200900110795
    Description:

    This article presents methods of combining cycles of the Canadian Community Health Survey and discusses issues to consider if these data are to be combined.

    Release date: 2009-02-18

  • Articles and reports: 91F0015M2008010
    Description:

    The objective of this study is to examine the feasibility of using provincial and territorial health care files of new registrants as an independent measure of preliminary inter-provincial and inter-territorial migration. The study aims to measure the conceptual and quantifiable differences between this data source and our present source, the Canada Revenue Agency's Canadian Child Tax Benefit file.

    Criteria were established to assess the quality and appropriateness of these provincial/territorial health care records as a proxy for our migration estimates: coverage, consistency, timeliness, reliability, level of detail, uniformity and accuracy.

    Based on the present analysis, the paper finds that these data do not improve the estimates and would not be suitable at this time as a measure of inter-provincial/territorial migration. These Medicare data are nonetheless an important independent data source that can be used for quality evaluation.

    Release date: 2009-01-13
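
As a minimal illustration of the two-phase calibration entry above (12-001-X200900110880): the sketch below shows the basic single-phase linear calibration step on which phase-one and phase-two calibrations build, solving the calibration equations for weights of the form w_i(1 + x_i'λ). It is our own simplification, not the paper's two-phase estimator; the data are invented.

```python
import numpy as np

def linear_calibration(w, X, totals):
    """Linear (GREG-type) calibration: adjust weights w so that the
    weighted totals of the auxiliary variables X match known totals.
    New weights are w_i * (1 + x_i' lambda)."""
    w = np.asarray(w, float)
    X = np.asarray(X, float)
    T = X.T @ (w[:, None] * X)                  # sum_i w_i x_i x_i'
    lam = np.linalg.solve(T, totals - X.T @ w)  # calibration equations
    return w * (1 + X @ lam)

rng = np.random.default_rng(3)
n = 200
X = np.column_stack([np.ones(n), rng.normal(50, 10, n)])  # intercept + covariate
w = np.full(n, 40.0)                                      # starting design weights
totals = np.array([8000.0, 8000.0 * 51.0])  # known population size and x-total
w_cal = linear_calibration(w, X, totals)
print(X.T @ w_cal)  # matches `totals` exactly
```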
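
As a minimal illustration of the R-indicator entry above (12-001-X200900110887): a commonly cited form of the R-indicator is R = 1 - 2·S(ρ̂), where S(ρ̂) is the (weighted) standard deviation of the estimated response propensities. The sketch below assumes that form; the propensities are invented.

```python
import numpy as np

def r_indicator(propensities, weights=None):
    """R-indicator R = 1 - 2 * S(rho), with S the (weighted) standard
    deviation of estimated response propensities. R = 1 corresponds to a
    perfectly representative response; lower values signal more variable
    propensities and hence greater risk of non-response bias."""
    rho = np.asarray(propensities, float)
    w = np.ones_like(rho) if weights is None else np.asarray(weights, float)
    mean = np.average(rho, weights=w)
    sd = np.sqrt(np.average((rho - mean) ** 2, weights=w))
    return 1 - 2 * sd

rng = np.random.default_rng(7)
rho_hat = np.clip(rng.normal(0.6, 0.08, 1000), 0, 1)  # e.g., from a logit model
print(f"R-indicator: {r_indicator(rho_hat):.3f}")     # about 1 - 2*0.08 = 0.84
```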
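
As a small numerical illustration of the traditional interviewer-effect formula quoted above (12-001-X200900110886); the workloads and correlation are invented:

```python
import numpy as np

# ieff = 1 + (mean workload - 1) * intra-interviewer correlation
workloads = np.array([18, 22, 20, 25, 15])  # interviews per interviewer
rho_int = 0.02                              # intra-interviewer correlation

n_bar = workloads.mean()                    # 20.0
ieff = 1 + (n_bar - 1) * rho_int
print(f"interviewer effect: {ieff:.2f}")    # 1 + 19 * 0.02 = 1.38
```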

Reference (94)

Reference (94) (25 of 94 results)

  • Technical products: 11-522-X2008000
    Description:

    Since 1984, an annual international symposium on methodological issues has been sponsored by Statistics Canada. Proceedings have been available since 1987. Symposium 2008 was the twenty fourth in Statistics Canada's series of international symposia on methodological issues. Each year the symposium focuses on a particular them. In 2008 the theme was: "Data Collection: Challenges, Achievements and New Directions".

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010972
    Description:

    Background: Evaluation of the coverage that results from linking routinely collected administrative hospital data with survey data is an important preliminary step to undertaking analyses based on the linked file. Data and methods: To evaluate the coverage of the linkage between data from cycle 1.1 of the Canadian Community Health Survey (CCHS) and in-patient hospital data (Health Person-Oriented Information or HPOI), the number of people admitted to hospital according to HPOI was compared with the weighted estimate for CCHS respondents who were successfully linked to HPOI. Differences between HPOI and the linked and weighted CCHS estimate indicated linkage failure and/or undercoverage. Results: According to HPOI, from September 2000 through November 2001, 1,572,343 people (outside Quebec) aged 12 or older were hospitalized. Weighted estimates from the linked CCHS, adjusted for agreement to link and plausible health number, were 7.7% lower. Coverage rates were similar for males and females. Provincial rates did not differ from those for the rest of Canada, although differences were apparent for the territories. Coverage rates were significantly lower among people aged 75 or older than among those aged 12 to 74.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010983
    Description:

    The US Census Bureau conducts monthly, quarterly, and annual surveys of the American economy and a census every 5 years. These programs require significant business effort. New technologies, new forms of organization, and scarce resources affect the ability of businesses to respond. Changes also affect what businesses expect from the Census Bureau, the Census Bureau's internal systems, and the way businesses interact with the Census Bureau.

    For several years, the Census Bureau has provided a special relationship to help large companies prepare for the census. We also have worked toward company-centric communication across all programs. A relationship model has emerged that focuses on infrastructure and business practices, and allows the Census Bureau to be more responsive.

    This paper focuses on the Census Bureau's company-centric communications and systems. We describe important initiatives and challenges, and we review their impact on Census Bureau practices and respondent behavior.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010955
    Description:

    Survey managers are still discovering the usefulness of digital audio recording for monitoring and managing field staff. Its value so far has been for confirming the authenticity of interviews, detecting curbstoning, offering a concrete basis for feedback on interviewing performance and giving data collection managers an intimate view of in-person interviews. In addition, computer audio-recorded interviewing (CARI) can improve other aspects of survey data quality, offering corroboration or correction of response coding by field staff. Audio recordings may replace or supplement in-field verbatim transcription of free responses, and speech-to-text technology might make this technique more efficient in the future.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010949
    Description:

    The expansion in scope of UK equality legislation has led to a requirement for data on sexual orientation. In response, the Office for National Statistics has initiated a project aiming to provide advice on best practice with regard to data collection in this field, and to examine the feasibility of providing data that will satisfy user needs. The project contains qualitative and quantitative research methodologies in relation to question development and survey operational issues. This includes:A review of UK and international surveys already collecting data on sexual orientation/identityA series of focus groups exploring conceptual issues surrounding "sexual identity" including related terms and the acceptability of questioning on multi-purpose household surveysA series of quantitative trials with particular attention to item non-response; question administration; and data collectionCognitively testing to ensure questioning was interpreted as intended.Quantitative research on potential bias issues in relation to proxy responsesFuture analysis and reporting issues are being considered alongside question development e.g. accurately capturing statistics on populations with low prevalence

    The presentation also discusses the practical survey administration issues relating to ensuring privacy in a concurrent interview situation, both face to face and over the telephone

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010979
    Description:

    Prior to 2006, the Canadian Census of Population relied on field staff to deliver questionnaires to all dwellings in Canada. For the 2006 Census, an address frame was created to cover almost 70% of dwellings in Canada, and these questionnaires were delivered by Canada Post. For the 2011 Census, Statistics Canada aims to expand this frame further, with a target of delivering questionnaires by mail to between 80% and 85% of dwellings. Mailing questionnaires for the Census raises a number of issues, among them: ensuring returned questionnaires are counted in the right area, creating an up to date address frame that includes all new growth, and determining which areas are unsuitable for having questionnaires delivered by mail. Changes to the address frame update procedures for 2011, most notably the decision to use purely administrative data as the frame wherever possible and conduct field update exercises only where deemed necessary, provide a new set of challenges for the 2011 Census.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010993
    Description:

    Until now, years of experience in questionnaire design were required to estimate how long it would take a respondent, on the average, to complete a CATI questionnaire for a new survey. This presentation focuses on a new method which produces interview time estimates for questionnaires at the development stage. The method uses Blaise Audit Trail data and previous surveys. It was developed, tested and verified for accuracy on some large scale surveys.

    First, audit trail data were used to determine the average time previous respondents took to answer specific types of questions: questions requiring a yes/no answer, scaled questions, "mark all that apply" questions, and so on. Second, for any given questionnaire, the paths taken by population sub-groups were mapped to identify the series of questions answered by different types of respondents, and timed to determine the longest possible interview time. Finally, the overall expected completion time was calculated using the estimated proportions of the population expected to answer each question.

    So far, we have used paradata to accurately estimate average respondent interview completion times. The method we developed could also be used to estimate completion times for specific respondents.
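
    As a rough illustration of this calculation (a sketch, not the authors' actual system), the following Python fragment combines hypothetical per-question-type timings with mapped paths and path proportions; every number and name in it is invented.

    ```python
    # Hypothetical per-question-type average times (seconds), of the kind
    # that could be derived from Blaise Audit Trail data on past surveys.
    avg_time = {"yes_no": 4.0, "scaled": 7.5, "mark_all": 12.0, "open": 25.0}

    # Mapped question paths for population sub-groups, and the estimated
    # proportion of the population following each path.
    paths = {
        "path_a": ["yes_no", "scaled", "scaled", "mark_all", "open"],
        "path_b": ["yes_no", "scaled", "open"],
    }
    proportions = {"path_a": 0.7, "path_b": 0.3}

    def path_time(path):
        """Total expected time for one path, in seconds."""
        return sum(avg_time[q] for q in path)

    # Longest possible interview time across the mapped paths.
    longest = max(path_time(p) for p in paths.values())

    # Overall expected completion time, weighted by path proportions.
    expected = sum(proportions[g] * path_time(p) for g, p in paths.items())

    print(f"longest: {longest:.1f} s, expected: {expected:.1f} s")
    ```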

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010958
    Description:

    Telephone Data Entry (TDE) is a system by which survey respondents can return their data to the Office for National Statistics (ONS) using the keypad on their telephone; it currently accounts for approximately 12% of total responses to ONS business surveys. ONS is currently increasing the number of surveys which use TDE as the primary mode of response, and this paper gives an overview of the redevelopment project, covering the redevelopment of the paper questionnaire, the enhancements made to the TDE system and the results from piloting these changes. The improved quality of the data received and the increased response via TDE following these developments suggest that data quality improvements and cost savings can be achieved by promoting TDE as the primary mode of response to short-term surveys.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010941
    Description:

    Prior to 2004, the design and development of collection functions at Statistics New Zealand (Statistics NZ) was done by a centralised team of data collection methodologists. In 2004, an organisational review considered whether the design and development of these functions was being done in the most effective way. A key issue was the rising costs of surveying as the organisation moved from paper-based data collection to electronic data collection. The review saw some collection functions decentralised. However, a smaller centralised team of data collection methodologists was retained to work with subject matter areas across Statistics NZ.

    This paper will discuss the strategy used by the smaller centralised team of data collection methodologists to support subject matter areas. There are three key themes to the strategy. First is the development of best practice standards and a central standards repository. Second is training and the introduction of knowledge sharing forums. Third is the provision of advice and independent review to the subject matter areas which design and develop collection instruments.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010940
    Description:

    Data Collection Methodology (DCM) enables the collection of good quality data by providing expert advice and assistance on questionnaire design, methods of evaluation and respondent engagement. DCM assists in the development of client skills, undertakes research and leads innovation in data collection methods. This is done in a challenging environment of organisational change and limited resources. This paper will cover 'how DCM does business' with clients and the wider methodological community to achieve our goals.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010920
    Description:

    On behalf of Statistics Canada, I would like to welcome you all, friends and colleagues, to Symposium 2008. This is the 24th International Symposium on survey methodology organized by Statistics Canada.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010948
    Description:

    Past survey instruments, whether in the form of a paper questionnaire or telephone script, were their own documentation. On this basis the ESRC Question Bank was created, providing free-access internet publication of questionnaires so that researchers could re-use questions, saving them trouble whilst improving the comparability of their data with that collected by others. Today, however, as survey technology and computer programs have become more sophisticated, accurate comprehension of the latest questionnaires seems more difficult, particularly when each survey team uses its own conventions to document complex items in technical reports. This paper seeks to illustrate these problems and to suggest preliminary standards of presentation to be used until the process can be automated.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010937
    Description:

    The context of the discussion is the increasing incidence of international surveys, of which one is the International Tobacco Control (ITC) Policy Evaluation Project, which began in 2002. The ITC country surveys are longitudinal, and their aim is to evaluate the effects of policy measures being introduced in various countries under the WHO Framework Convention on Tobacco Control. The challenges of organization, data collection and analysis in international surveys are reviewed and illustrated. Analysis is an increasingly important part of the motivation for large scale cross-cultural surveys. The fundamental challenge for analysis is to discern the real response (or lack of response) to policy change, separating it from the effects of data collection mode, differential non-response, external events, time-in-sample, culture, and language. Two problems relevant to statistical analysis are discussed. The first problem is the question of when and how to analyze pooled data from several countries, in order to strengthen conclusions which might be generally valid. While in some cases this seems to be straightforward, there are differing opinions on the extent to which pooling is possible and reasonable. It is suggested that for formal comparisons, random effects models are of conceptual use. The second problem is to find models of measurement across cultures and data collection modes which will enable calibration of continuous, binary and ordinal responses, and produce comparisons from which extraneous effects have been removed. It is noted that hierarchical models provide a natural way of relaxing requirements of model invariance across groups.
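
    For readers who want to see what the suggested random effects formulation could look like in practice, here is a minimal sketch using statsmodels; the data file and the variable names (smoking, policy_exposure, wave, country) are assumptions for illustration, not the ITC project's actual specification.

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical pooled ITC-style data: one row per respondent-wave.
    df = pd.read_csv("itc_pooled.csv")

    # A random intercept for country allows baselines to differ across
    # countries while a common policy effect is estimated from the pool.
    model = smf.mixedlm("smoking ~ policy_exposure + wave",
                        data=df, groups=df["country"])
    result = model.fit()
    print(result.summary())
    ```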

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011013
    Description:

    Audio recording of interviews can be an effective and versatile data collection tool. These recordings, however, can lead to large files which are cumbersome to manage. Technological developments, including better audio software development tools and the increased adoption of broadband connections, have eased the burden of collecting audio data. This paper focuses on the technologies and techniques used to record and manage audio-recorded surveys using laptops, telephones and internet connections. The process outlined involves devices that connect directly to the phone receiver and stream conversations to the laptop for storage and transmission.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010984
    Description:

    The Enterprise Portfolio Manager (EPM) Program at Statistics Canada demonstrated the value of employing a "holistic" approach to managing the relationships we have with our largest and most complex business respondents.

    Understanding that different types of respondents should receive different levels of intervention, and having learnt the value of an "enterprise-centric" approach to managing relationships with important, complex data providers, STC has embraced a response management strategy that divides its business population into four tiers based on size, complexity and importance to survey estimates. With the population thus segmented, different response management approaches have been developed, each appropriate to the relative contribution of its segment. This allows STC to target resources to the areas where it stands to achieve the greatest return on investment. Tiers I and II have been defined as critical to survey estimates.

    Tier I represents the largest, most complex businesses in Canada and is managed through the Enterprise Portfolio Management Program.

    Tier II represents businesses that are smaller or less complex than Tier I but still significant in developing accurate measures of the activities of individual industries.

    Tier III includes medium-sized businesses, those that form the bulk of survey samples.

    Tier IV represents the smallest businesses, which are excluded from collection; for these, STC relies entirely on tax information.

    The presentation will outline:

    • It works! Results and metrics from the programs that have operationalized the Holistic Response Management strategy.
    • The development of a less subjective, methodological approach to segmenting the business survey population for HRM.
    • The project team's work to capture the complexity factors intrinsically used by experienced staff to rank respondents.
    • What our so-called "problem" respondents have told us about the issues underlying non-response.
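
    To make the tiering concrete, here is a purely hypothetical sketch; the thresholds and scores are invented, since the abstract does not spell out the actual STC criteria.

    ```python
    def assign_tier(revenue_share, complexity_score, surveyed):
        """Assign a business to a response management tier (invented rules)."""
        if not surveyed:
            return 4        # smallest businesses: tax data only
        if revenue_share > 0.01 and complexity_score > 0.8:
            return 1        # largest, most complex: EPM program
        if revenue_share > 0.001:
            return 2        # significant for industry-level estimates
        return 3            # medium-sized: bulk of survey samples

    print(assign_tier(0.02, 0.9, True))    # -> 1
    print(assign_tier(0.005, 0.2, True))   # -> 2
    ```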

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011016
    Description:

    Now that we have come to the end of a day of workshops plus three very full days of sessions, I have the very pleasant task of offering a few closing remarks and, more importantly, of recognizing the efforts of those who have contributed to the success of this year's symposium. And it has clearly been a success.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010988
    Description:

    Online data collection emerged in 1995 as an alternative approach for conducting certain types of consumer research studies, and its use had grown considerably by 2008. This growth has been primarily in studies where non-probability sampling methods are used. While online sampling has gained acceptance for some research applications, serious questions remain concerning online samples' suitability for research requiring precise volumetric measurement of the behavior of the U.S. population, particularly their travel behavior. This paper reviews the literature and compares results from studies using probability samples and online samples to understand whether results differ between the two sampling approaches. The paper also demonstrates that online samples underestimate critical types of travel even after demographic and geographic weighting.
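
    As a minimal sketch of the kind of demographic cell weighting the paper evaluates (all data and population shares below are invented), the adjustment step might look like this; note that such weighting cannot remove selection bias that operates within the cells.

    ```python
    import pandas as pd

    sample = pd.DataFrame({
        "age_group": ["18-34", "18-34", "35-54", "55+", "55+"],
        "trips":     [2, 3, 1, 0, 1],
    })

    # Hypothetical population shares for each demographic cell.
    pop_share = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

    # Cell weight = population share / sample share.
    sample_share = sample["age_group"].value_counts(normalize=True)
    sample["weight"] = sample["age_group"].map(
        lambda g: pop_share[g] / sample_share[g])

    weighted_mean = ((sample["trips"] * sample["weight"]).sum()
                     / sample["weight"].sum())
    print(f"weighted mean trips: {weighted_mean:.2f}")
    ```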

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010974
    Description:

    This paper will focus on establishment survey questionnaire design guidelines. More specifically, it will discuss the process involved in transitioning a set of guidelines written for a broad survey methodological audience to a narrower, agency-specific audience of survey managers and analysts. The process involved the work of a team composed of individuals from across the Census Bureau's Economic Directorate, working in a cooperative and collaborative manner. The team decided what needed to be added, modified, and deleted from the broad starting point, and determined how much of the theory and experimental evidence found in the literature needed to be included in the guidelines. In addition to discussing the process, the paper will also describe the end result: a set of questionnaire design guidelines for the Economic Directorate.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011010
    Description:

    The Survey of Employment, Payrolls and Hours (SEPH) is a monthly survey using two sources of data: a census of payroll deduction (PD7) forms (administrative data) and a survey of business establishments. This paper focuses on the processing of the administrative data, from the weekly receipt of data from the Canada Revenue Agency to the production of SEPH's monthly estimates.

    The edit and imputation methods used to process the administrative data have been revised in the last several years. The goals of this redesign were primarily to improve data quality and to increase consistency with another administrative data source (T4), which is a benchmark measure for Statistics Canada's System of National Accounts. An additional goal was to ensure that the new process would be easier to understand and to modify if needed. As a result, a new processing module was developed to edit and impute PD7 forms before their data are aggregated to the monthly level.

    This paper presents an overview of both the current and new processes, including a description of challenges that we faced during development. Improved quality is demonstrated both conceptually (by presenting examples of PD7 forms and their treatment under the old and new systems) and quantitatively (by comparison to T4 data).
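
    The abstract does not reproduce the actual edit rules, so the following is only an illustrative sketch of what a record-level edit-and-impute step can look like; the plausibility bounds and the fallback to a unit's historical average are invented.

    ```python
    def edit_and_impute(gross_pay, employees, historical_avg_pay):
        """Flag implausible pay per employee and impute from history."""
        if employees and gross_pay is not None:
            per_head = gross_pay / employees
            # Edit rule (invented bounds): accept plausible values as-is.
            if 100 <= per_head <= 50_000:
                return gross_pay, False
        # Imputation: carry forward the unit's historical average pay.
        imputed = (historical_avg_pay or 0.0) * (employees or 0)
        return imputed, True

    value, was_imputed = edit_and_impute(None, 12, 4_200.0)
    print(value, was_imputed)  # -> 50400.0 True
    ```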

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011001
    Description:

    The Québec Population Health Survey (EQSP), currently under way with collection wrapping up in February 2009, provides an opportunity, because of the size of its sample, to assess in a controlled environment the impact that sending introductory letters to respondents has on the response rate. Since this regional telephone survey is expected to have more than 38,000 respondents, it was possible to use part of its sample for this study without having too great an impact on its overall response rate. In random digit dialling (RDD) surveys such as the EQSP, one of the main challenges in sending out introductory letters is reaching the survey units. Doing so depends largely on our capacity to associate an address with the sample units and on the quality of that information.

    This article describes the controlled study proposed by the Institut de la statistique du Québec to measure the effect that sending out introductory letters to respondents had on the survey's response rate.
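
    A controlled study of this kind typically compares the response rate of a subsample that received the letter with that of a subsample that did not; the sketch below, with invented counts, shows one standard way to test the difference.

    ```python
    from statsmodels.stats.proportion import proportions_ztest

    responded = [2150, 1980]   # letter group, no-letter group (invented)
    contacted = [3000, 3000]

    stat, pvalue = proportions_ztest(responded, contacted)
    print(f"letter: {responded[0] / contacted[0]:.1%}, "
          f"no letter: {responded[1] / contacted[1]:.1%}, p = {pvalue:.4f}")
    ```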

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010952
    Description:

    In a survey whose results were estimated by simple averages, we compare the effect on the results of a follow-up among non-respondents with that of weighting based on the last ten percent of the respondents. The data come from the Survey of Living Conditions among Immigrants in Norway, which was carried out in 2006.
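
    One common way to operationalize the weighting alternative is to treat the late respondents as a proxy for the non-respondents and inflate their weights accordingly; the sketch below illustrates the idea with simulated values (nothing here comes from the Norwegian survey).

    ```python
    import numpy as np

    rng = np.random.default_rng(42)
    early = rng.normal(0.60, 0.1, 900)   # first 90% of respondents
    late = rng.normal(0.45, 0.1, 100)    # last 10% of respondents
    n_nonresp = 500                      # units who never responded

    values = np.concatenate([early, late])
    naive = values.mean()

    # Late respondents also stand in for the non-respondents.
    weights = np.concatenate([np.ones(early.size),
                              np.full(late.size, 1 + n_nonresp / late.size)])
    weighted = (values * weights).sum() / weights.sum()

    print(f"naive: {naive:.3f}, late-respondent weighted: {weighted:.3f}")
    ```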

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010946
    Description:

    In the mid 1990s the first question testing unit was set up in the UK Office for National Statistics (ONS). The key objective of the unit was to develop and test the questions and questionnaire for the 2001 Census. Since the establishment of this unit, the area has been expanded into a Data Collection Methodology (DCM) Centre of Expertise, which now sits in the Methodology Directorate. The DCM centre has three branches which support DCM work for social surveys, business surveys, the Census and external organisations.

    In the past ten years DCM has achieved a variety of things. For example, it has introduced survey methodology involvement in the development and testing of business survey questions and questionnaires; introduced a mixed-method approach to the development of questions and questionnaires; developed and implemented standards, e.g. for the 2011 Census questionnaire and showcards; and developed and delivered DCM training events.

    This paper will provide an overview of data collection methodology at the ONS from the perspective of achievements and challenges. It will cover areas such as methods, staff (e.g. recruitment, development and field security), and integration with the survey process.

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010980
    Description:

    A census is the largest and possibly one of the most complex data collection operations undertaken by a government. Many of the challenges encountered are linked to the sheer size of the operation, when millions of dwellings need to be contacted and thousands of people must be mobilized to help in the data collection efforts. Statistics Canada is a world leader in its approaches to census data collection. New collection approaches were introduced with the 2006 Census, most notably an Internet response option, to add to the mail-out, telephone and face-to-face collection approaches. Such diversity in data collection methods requires an integrated approach to management to ensure quality and efficiency in an environment of declining survey response rates and a tighter fiscal framework. In preparing for its 2011 Census, Statistics Canada is putting in place a number of new systems and processes to actively manage field data collection operations. One of the key elements of the approach will be a Field Management System, which will allow the majority of field personnel to register enumeration progress in the field and will inform them, in a very timely fashion, of questionnaires received at the Data Operations Centre via Internet, by mail or other channels, so that they can cease non-response follow-up efforts on those dwellings and avoid unnecessary follow-up work.
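
    The workload-trimming idea at the heart of that system is simple to sketch; this toy example (with made-up dwelling identifiers) just removes dwellings from the non-response follow-up list once a questionnaire is registered as received through any channel.

    ```python
    # Dwellings currently assigned for non-response follow-up (NRFU).
    nrfu_list = {"A101", "A102", "B201", "B202"}

    # Questionnaires registered today at the Data Operations Centre,
    # whether received via Internet, by mail or through other channels.
    received = {"A102", "B202"}

    nrfu_list -= received     # cease follow-up on those dwellings
    print(sorted(nrfu_list))  # -> ['A101', 'B201']
    ```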

    Release date: 2009-12-03

  • Technical products: 11-522-X200800011003
    Description:

    This study examined the feasibility of developing correction factors to adjust self-reported measures of body mass index (BMI) to more closely approximate measured values. Data are from the 2005 Canadian Community Health Survey, in which respondents were asked to report their height and weight and were subsequently measured. Regression analyses were used to determine which socio-demographic and health characteristics were associated with the discrepancies between reported and measured values. The sample was then split into two groups. In the first, measured BMI was regressed on self-reported BMI and the predictors of the discrepancies. Correction equations were generated using all predictor variables that were significant at the p < 0.05 level. These correction equations were then tested in the second group to derive estimates of sensitivity, specificity and obesity prevalence. Logistic regression was used to examine the relationship between measured, reported and corrected BMI and obesity-related health conditions. Corrected estimates provided more accurate measures of obesity prevalence, mean BMI and sensitivity levels. Self-reported data exaggerated the relationship between BMI and health conditions, while in most cases the corrected estimates provided odds ratios that were more similar to those generated with the measured BMI.
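
    A minimal sketch of this split-sample correction approach, assuming a hypothetical extract that contains both reported and measured BMI (the file and variable names are invented):

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("cchs_2005.csv")          # hypothetical extract
    half1, half2 = df.iloc[::2], df.iloc[1::2]

    # Correction equation estimated in the first half-sample: measured
    # BMI regressed on self-reported BMI plus discrepancy predictors.
    fit = smf.ols("measured_bmi ~ reported_bmi + sex + age_group",
                  data=half1).fit()

    # Apply the equation in the second half-sample and compare the
    # obesity prevalence (BMI >= 30) across the three measures.
    half2 = half2.assign(corrected_bmi=fit.predict(half2))
    for col in ("reported_bmi", "corrected_bmi", "measured_bmi"):
        print(col, (half2[col] >= 30).mean())
    ```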

    Release date: 2009-12-03

  • Technical products: 11-522-X200800010971
    Description:

    Keynote address

    Release date: 2009-12-03
