Survey design


Results

All (10 results)

  • Journals and periodicals: 92-395-X
    Description:

    This report describes sampling and weighting procedures used in the 2001 Census. It reviews the history of these procedures in Canadian censuses, provides operational and theoretical justifications for them, and presents the results of the evaluation studies of these procedures.

    Release date: 2004-12-15

  • Articles and reports: 11-522-X20020016721
    Description:

    This paper examines the simulation study that was conducted to assess the sampling scheme designed for the World Health Organization (WHO) Injection Safety Assessment Survey. The objective of this assessment survey is to determine whether facilities in which injections are given meet the necessary safety requirements for injection administration, equipment, supplies and waste disposal. The main parameter of interest is the proportion of health care facilities in a country that have safe injection practices.

    The objective of this simulation study was to assess the accuracy and precision of the proposed sampling design. To this end, two artificial populations were created based on the two African countries of Niger and Burkina Faso, in which the pilot survey was tested. To create a wide variety of hypothetical populations, the assignment of whether a health care facility was safe or not was based on the different combinations of the population proportion of safe health care facilities in the country, the homogeneity of the districts in the country with respect to injection safety, and whether the health care facility was located in an urban or rural district.

    Using the results of the simulation, a multi-factor analysis of variance was used to determine which factors affect the outcome measures of absolute bias, standard error and mean-squared error.

    Release date: 2004-09-13
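The kind of simulation described in this abstract can be sketched as follows. This is a minimal illustration only: the population sizes, the way `homogeneity` is modelled, and the two-stage design (districts, then facilities) are all hypothetical assumptions, not the WHO survey design or the authors' populations. The sketch shows how bias, standard error and mean-squared error of an estimated proportion can be measured over repeated samples.

```python
import random
import statistics


def make_population(n_districts=40, facilities_per_district=25,
                    p_safe=0.6, homogeneity=0.5, seed=1):
    """Build a hypothetical population: one list of 0/1 safety flags per district."""
    rng = random.Random(seed)
    population = []
    for _ in range(n_districts):
        # Assumption: with probability `homogeneity` a district is entirely
        # safe or entirely unsafe; otherwise it follows the national rate.
        if rng.random() < homogeneity:
            p_district = 1.0 if rng.random() < p_safe else 0.0
        else:
            p_district = p_safe
        population.append([1 if rng.random() < p_district else 0
                           for _ in range(facilities_per_district)])
    return population


def two_stage_estimate(population, n_districts, n_facilities, rng):
    """Sample districts, then facilities within each; return the sample proportion."""
    districts = rng.sample(population, n_districts)
    flags = [f for d in districts for f in rng.sample(d, n_facilities)]
    return sum(flags) / len(flags)


def evaluate_design(population, n_districts=8, n_facilities=10,
                    reps=2000, seed=2):
    """Repeat the sampling and summarise bias, standard error and MSE."""
    rng = random.Random(seed)
    true_p = sum(sum(d) for d in population) / sum(len(d) for d in population)
    estimates = [two_stage_estimate(population, n_districts, n_facilities, rng)
                 for _ in range(reps)]
    bias = statistics.fmean(estimates) - true_p
    se = statistics.pstdev(estimates)
    mse = statistics.fmean([(e - true_p) ** 2 for e in estimates])
    return {"true_p": true_p, "bias": bias, "abs_bias": abs(bias),
            "se": se, "mse": mse}


if __name__ == "__main__":
    print(evaluate_design(make_population()))
```

Varying `p_safe` and `homogeneity` across runs would give the factor combinations that an analysis of variance on the outcome measures could then compare.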

  • Articles and reports: 11-522-X20020016736
    Description:

    The US Census Bureau supports research into an optimal design program as an alternative to its current decennial redesign of demographic surveys. The optimal design program seeks to optimize redesign samples annually and reduce deterioration of the precision of survey estimates.

    Initial research has focussed on the use of multi-agent systems (also known as distributed artificial intelligence) to produce optimal annual samples for all demographic surveys. The first multi-agent system optimizes redesign inputs. It represents each housing unit as an autonomous agent and solves the distributed constraint satisfaction problem (DCSP) to forecast household characteristics that are consistent with recent survey data and estimates. The second multi-agent system selects optimal samples for all demographic surveys. It represents each survey-state pair as a deliberative agent and applies the Bayesian optimization algorithm (BOA) at each design stage to partition the sampling units into sample and non-sample subsets. Thus, sampling units are selected directly, without the need for initial stratification.

    Release date: 2004-09-13

  • Articles and reports: 11-522-X20020016737
    Description:

    If the dataset available to machine learning results from cluster sampling (e.g., patients from a sample of hospital wards), the usual cross-validation error rate estimate can lead to biased and misleading results. In this technical paper, an adapted cross-validation is described for this case. Using a simulation, the sampling distribution of the generalization error rate estimate, under the cluster or simple random sampling hypothesis, is compared with the true value. The results highlight the impact of the sampling design on inference: clustering clearly has a significant impact, and the split between learning set and test set should result from a random partition of the clusters, not from a random partition of the examples. With cluster sampling, standard cross-validation underestimates the generalization error rate and is deficient for model selection. These results are illustrated with a real application of automatic identification of spoken language.

    Release date: 2004-09-13
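The central recommendation of this abstract, partitioning clusters rather than individual examples, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `cluster_folds` and the ward labels are hypothetical, and the paper's own adapted cross-validation may differ in detail.

```python
import random
from collections import defaultdict


def cluster_folds(cluster_ids, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation
    in which every cluster falls entirely in either the learning or test set."""
    by_cluster = defaultdict(list)
    for index, cluster in enumerate(cluster_ids):
        by_cluster[cluster].append(index)

    clusters = sorted(by_cluster)
    if k > len(clusters):
        raise ValueError("k cannot exceed the number of clusters")

    # Randomly partition the clusters (not the examples) into k folds.
    random.Random(seed).shuffle(clusters)
    for fold in range(k):
        test_clusters = set(clusters[fold::k])
        test = [i for c in clusters if c in test_clusters
                for i in by_cluster[c]]
        train = [i for c in clusters if c not in test_clusters
                 for i in by_cluster[c]]
        yield train, test


if __name__ == "__main__":
    # Hypothetical data: one ward label per patient record.
    wards = ["A", "A", "B", "B", "B", "C", "D", "D", "E", "F"]
    for train, test in cluster_folds(wards, k=3):
        print("train:", train, "test:", test)
```

Because no ward appears on both sides of a split, the test error is not flattered by within-cluster similarity between learning and test examples.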

  • Articles and reports: 11-522-X20020016748
    Description:

    Practitioners often use data collected from complex surveys (such as labour force and health surveys involving stratified cluster sampling) to fit logistic regression and other models of interest. A great deal of effort over the last two decades has been spent on developing methods to analyse survey data that take account of design features. This paper looks at an alternative method known as inverse sampling.

    Specialized programs, such as SUDAAN and WESVAR, are also available to implement some of the methods developed to take into account the design features. However, these methods require additional information such as survey weights, design effects or cluster identification for microdata, and thus another method is necessary.

    Inverse sampling (Hinkins et al., Survey Methodology, 1997) provides an alternative approach by undoing the complex data structures so that standard methods can be applied. Repeated subsamples with simple random structure are drawn, each subsample is analysed by standard methods, and the results are combined to increase the efficiency. Although computer-intensive, this method has the potential to preserve confidentiality of microdata files. A drawback of the method is that it can lead to biased estimates of regression parameters when the subsample sizes are small (as in the case of stratified cluster sampling).

    In this paper, we propose using the estimating equation approach that combines the subsamples before estimation and thus leads to nearly unbiased estimates of regression parameters regardless of subsample sizes. This method is computationally less intensive than the original method. We apply the method to cluster-correlated data generated from a nested error linear regression model to illustrate its advantages. A real dataset from a Statistics Canada survey will also be analysed using the estimating equation method.

    Release date: 2004-09-13

  • Surveys and statistical programs – Documentation: 75F0002M2004006
    Description:

    This document presents information about the entry-exit portion of the annual labour and the income interviews of the Survey of Labour and Income Dynamics (SLID).

    Release date: 2004-06-21

  • Articles and reports: 82-003-X20030036847
    Geography: Canada
    Description:

    This paper examines whether accepting proxy- instead of self-responses results in lower estimates of some health conditions. It analyses data from the National Population Health Survey and the Canadian Community Health Survey.

    Release date: 2004-05-18

  • Articles and reports: 88F0006X2004006
    Description:

    Biotechnology is a pervasive technology used in several industrial sectors, making collecting sound data a real challenge. This paper describes the methodology of the Biotechnology Use and Development Survey. Some of the specific issues dealt with are the definitions of biotechnology and innovative biotechnology firms, the target population, sampling, data collection procedures, and data quality evaluation.

    Release date: 2004-03-05

  • Articles and reports: 12-001-X20030026782
    Description:

    This paper discusses both the general questions involved in designing a post-enumeration survey and how these questions were addressed in the U.S. Census Bureau's coverage measurement planned as part of Census 2000. It relates the basic concepts of the Dual System Estimator to questions of the definition and measurement of correct enumerations, the measurement of census omissions, operational independence, reporting of residence, and the role of after-matching reinterview. It discusses estimation issues such as the treatment of movers, missing data, and synthetic estimation of local corrected population size. It also discusses where the design failed in Census 2000.

    Release date: 2004-01-27
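For readers unfamiliar with the Dual System Estimator mentioned in this abstract, its textbook form (the Lincoln-Petersen capture-recapture estimator) can be sketched as follows. This is an illustration only: the function name, argument names and figures are hypothetical, and the Census 2000 estimation discussed in the paper involves further adjustments (movers, missing data, synthetic estimation) that are not shown here.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    """Textbook dual system estimate of the true population size.

    census_count  -- correct enumerations in the census
    survey_count  -- people found by the independent post-enumeration survey
    matched_count -- survey people who were also matched to a census record
    """
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    # Under independence, matched/survey estimates the census coverage rate,
    # so the population is the census count divided by that rate.
    return census_count * survey_count / matched_count


if __name__ == "__main__":
    # Hypothetical figures: 90 of 100 surveyed people match the census,
    # so census coverage is estimated at 90%.
    print(dual_system_estimate(census_count=900, survey_count=100,
                               matched_count=90))  # 1000.0
```

The estimate depends on the operational independence of the census and the survey, which is one of the design questions the paper discusses.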

  • Articles and reports: 12-001-X20030026787
    Description:

    Application of classical statistical methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Methods have been developed that account for the survey design, but these methods require additional information such as survey weights, design effects or cluster identification for microdata. Inverse sampling (Hinkins, Oh and Scheuren 1997) provides an alternative approach by undoing the complex survey data structures so that standard methods can be applied. Repeated subsamples with unconditional simple random sampling structure are drawn, each subsample is analysed by standard methods, and the results are then combined to increase the efficiency. Although computer-intensive, this method has the potential to preserve confidentiality of microdata. We present some theory of inverse sampling and explore its limitations. A combined estimating equations approach is proposed for handling complex parameters such as ratios and "census" linear regression and logistic regression parameters. The method is applied to a cluster-correlated data set reported in Battese, Harter and Fuller (1988).

    Release date: 2004-01-27
