Survey design


Results

All (12) (0 to 10 of 12 results)

  • Articles and reports: 12-001-X20060029546
    Description:

    We discuss methods for the analysis of case-control studies in which the controls are drawn using a complex sample survey. The most straightforward method is the standard survey approach based on weighted versions of population estimating equations. We also look at more efficient methods and compare their robustness to model misspecification in simple cases. Case-control family studies, where the within-cluster structure is of interest in its own right, are also discussed briefly.

    Release date: 2006-12-21
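
    A minimal sketch of the "standard survey approach" mentioned above, assuming for illustration a logistic model: the population score equations are weighted by the survey weights and solved by Newton-Raphson. The function and data layout are illustrative, not taken from the paper.

    ```python
    import numpy as np

    def weighted_logistic(X, y, w, n_iter=25, tol=1e-8):
        """Solve the survey-weighted score equations
        sum_i w_i * (y_i - p_i) * x_i = 0 by Newton-Raphson."""
        beta = np.zeros(X.shape[1])
        for _ in range(n_iter):
            p = 1.0 / (1.0 + np.exp(-X @ beta))            # fitted probabilities
            score = X.T @ (w * (y - p))                    # weighted score
            info = X.T @ (X * (w * p * (1 - p))[:, None])  # weighted information
            step = np.linalg.solve(info, score)
            beta += step
            if np.max(np.abs(step)) < tol:
                break
        return beta
    ```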

  • Articles and reports: 12-001-X20060029550
    Description:

    In this paper, the geometric, optimization-based, and Lavallée and Hidiroglou (LH) approaches to stratification are compared. The geometric stratification method is an approximation, whereas the other two approaches, which employ numerical methods, may be seen as optimal stratification methods. The geometric stratification algorithm is very simple compared with the other two, but it does not provide for a take-all stratum, which is usually constructed when a positively skewed population is stratified. In optimization-based stratification, one may consider any form of objective function and constraints. In a comparative numerical study based on five positively skewed artificial populations, the optimization approach was more efficient than geometric stratification in every case studied. The geometric and optimization approaches are also compared with the LH algorithm: geometric stratification was found to be less efficient than the LH algorithm, whereas the efficiency of the optimization approach was similar to that of the LH algorithm. Nevertheless, strata boundaries obtained by geometric stratification may serve as efficient starting points for the optimization approach.

    Release date: 2006-12-21
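
    The geometric rule above has a closed form: with L strata and a strictly positive stratification variable ranging over [a, b], the interior boundaries form a geometric progression b_h = a * r**h with r = (b/a)**(1/L). A minimal sketch; as the abstract suggests, the result can seed the optimization or LH algorithms.

    ```python
    import numpy as np

    def geometric_boundaries(x, n_strata):
        """Interior stratum boundaries b_1..b_{L-1} of geometric
        stratification: b_h = a * r**h with r = (b/a)**(1/L)."""
        a, b = float(np.min(x)), float(np.max(x))
        r = (b / a) ** (1.0 / n_strata)
        return a * r ** np.arange(1, n_strata)
    ```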

  • Articles and reports: 12-001-X20060029552
    Description:

    A survey of tourist visits to Brittany, originating both within and outside the region, was needed. For practical and material reasons, "border surveys" could no longer be used. The major problem is the lack of a sampling frame that allows for direct contact with tourists. This problem was addressed by applying the indirect sampling method, the weighting for which is obtained using the generalized weight share method developed by Lavallée (1995, 2002) and Deville (1999) and presented in Lavallée and Caron (2001). This article shows how to adapt the method to the survey; a number of extensions are required. One of these extensions, designed to estimate the total of a population from which a Bernoulli sample has been taken, is developed.

    Release date: 2006-12-21
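
    A minimal sketch of the generalized weight share idea, assuming 0/1 links: each target unit inherits a share of the design weights of the sampled frame units linked to it, divided by its total link count in the population. Names and layout are illustrative.

    ```python
    import numpy as np

    def gwsm_weights(d_sample, links, L_total):
        """Generalized weight share method (sketch).
        d_sample: design weights of the n sampled frame units, shape (n,)
        links:    0/1 link indicators, frame units x target units, shape (n, m)
        L_total:  total number of links to each target unit in the whole
                  population (assumed known), shape (m,)
        Returns one estimation weight per target unit:
        w_j = sum_i d_i * l_ij / L_j."""
        return (d_sample @ links) / L_total

    # A total over the target population is then estimated as sum(w * y).
    ```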

  • Articles and reports: 12-001-X20060029553
    Description:

    Félix-Medina and Thompson (2004) proposed a variant of link-tracing sampling in which it is assumed that a portion of the population, not necessarily the major portion, is covered by a frame of disjoint sites where members of the population can be found with high probability. A sample of sites is selected, and the people in each selected site are asked to nominate other members of the population. They proposed maximum likelihood estimators of the population sizes that perform acceptably provided that, for each site, the probability that a member is nominated by that site (the nomination probability) is not small. In this research we consider Félix-Medina and Thompson's variant and propose three sets of estimators of the population sizes derived under the Bayesian approach. Two of the sets were obtained using improper prior distributions of the population sizes, and the other using Poisson priors. However, we use the Bayesian approach only to assist in the construction of estimators; inferences about the population sizes are made under the frequentist approach. We propose two types of partly design-based variance estimators and confidence intervals, one obtained using a bootstrap and the other using the delta method along with the assumption of asymptotic normality. The results of a simulation study indicate that (i) when the nomination probabilities are not small, each of the proposed sets of estimators performs well and very similarly to the maximum likelihood estimators; (ii) when the nomination probabilities are small, the set of estimators derived using Poisson priors still performs acceptably and does not have the bias problems of the maximum likelihood estimators; and (iii) these results do not depend on the fraction of the population covered by the frame.

    Release date: 2006-12-21
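
    The abstract mentions a bootstrap variance estimator. The sketch below shows only the generic resampling idea (resample the selected sites with replacement, recompute the point estimate, take the variance of the replicates), not the authors' specific partly design-based procedure.

    ```python
    import numpy as np

    def bootstrap_variance(estimator, sites, n_boot=1000, seed=None):
        """Generic bootstrap variance: empirical variance of the estimator
        over n_boot with-replacement resamples of the selected sites."""
        rng = np.random.default_rng(seed)
        sites = np.asarray(sites)
        n = len(sites)
        reps = [estimator(sites[rng.integers(0, n, size=n)])
                for _ in range(n_boot)]
        return np.var(reps, ddof=1)
    ```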

  • Articles and reports: 12-001-X20060029554
    Description:

    Survey sampling to estimate a Consumer Price Index (CPI) is quite complicated, generally requiring a combination of data from at least two surveys: one giving prices, one giving expenditure weights. Fundamentally different approaches to the sampling process (probability sampling and purposive sampling) have each been strongly advocated and are used by different countries in the collection of price data. By constructing a small "world" of purchases and prices from scanner data on cereal and then simulating various sampling and estimation techniques, we compare the results of two design and estimation approaches: the probability approach of the United States and the purposive approach of the United Kingdom. For the same amount of information collected, but given the use of different estimators, the United Kingdom's methods appear to offer better overall accuracy in targeting a population superlative consumer price index.

    Release date: 2006-12-21
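
    The study above targets a superlative index; the Fisher ideal index is the standard example of one. A minimal sketch, with price and quantity vectors for the base (0) and current (1) periods:

    ```python
    import numpy as np

    def fisher_index(p0, p1, q0, q1):
        """Fisher 'ideal' (superlative) price index: the geometric mean of
        the Laspeyres and Paasche indexes."""
        laspeyres = np.sum(p1 * q0) / np.sum(p0 * q0)  # base-period basket
        paasche = np.sum(p1 * q1) / np.sum(p0 * q1)    # current-period basket
        return np.sqrt(laspeyres * paasche)
    ```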

  • Articles and reports: 12-001-X20060029555
    Description:

    Researchers and policy makers often use data from nationally representative probability sample surveys. The number of topics covered by such surveys, and hence the amount of interviewing time involved, have typically increased over the years, resulting in increased costs and respondent burden. A potential solution to this problem is to carefully form subsets of the items in a survey and administer one such subset to each respondent. Designs of this type are called "split-questionnaire" designs or "matrix sampling" designs. The administration of only a subset of the survey items to each respondent in a matrix sampling design creates what can be considered missing data. Multiple imputation (Rubin 1987), a general-purpose approach developed for handling data with missing values, is appealing for the analysis of data from a matrix sample, because once the multiple imputations are created, data analysts can apply standard methods for analyzing complete data from a sample survey. This paper develops and evaluates a method for creating matrix sampling forms, each form containing a subset of items to be administered to randomly selected respondents. The method can be applied in complex settings, including situations in which skip patterns are present. Forms are created in such a way that each form includes items that are predictive of the excluded items, so that subsequent analyses based on multiple imputation can recover some of the information about the excluded items that would have been collected had there been no matrix sampling. The matrix sampling and multiple-imputation methods are evaluated using data from the National Health and Nutrition Examination Survey, one of many nationally representative probability sample surveys conducted by the National Center for Health Statistics, Centers for Disease Control and Prevention. The study demonstrates the feasibility of the approach applied to a major national health survey with complex structure, and it provides practical advice about appropriate items to include in matrix sampling designs in future surveys.

    Release date: 2006-12-21
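
    Once the m analyses of the multiply imputed datasets are run, their results are pooled with Rubin's (1987) combining rules, sketched here for a scalar estimand:

    ```python
    import numpy as np

    def rubin_combine(estimates, variances):
        """Rubin's (1987) rules: the pooled estimate is the mean of the m
        completed-data estimates; total variance is the within-imputation
        variance plus (1 + 1/m) times the between-imputation variance."""
        q = np.asarray(estimates, dtype=float)
        u = np.asarray(variances, dtype=float)
        m = len(q)
        q_bar = q.mean()                 # pooled point estimate
        u_bar = u.mean()                 # within-imputation variance
        b = q.var(ddof=1)                # between-imputation variance
        t = u_bar + (1.0 + 1.0 / m) * b  # total variance
        return q_bar, t
    ```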

  • Articles and reports: 12-001-X20060019256
    Description:

    In some situations the sample design of a survey is rather complex, consisting of fundamentally different designs in different domains. The design effect for estimates based on the total sample is a weighted sum of the domain-specific design effects. We derive these weights under an appropriate model and illustrate their use with data from the European Social Survey (ESS).

    Release date: 2006-07-20
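
    The identity in the abstract, deff = sum_d w_d * deff_d, is a one-liner; the derivation of the weights w_d under the paper's model is not reproduced here.

    ```python
    def total_deff(domain_deffs, weights):
        """Design effect of the total-sample estimate as a weighted sum of
        domain-specific design effects; the weights are assumed given and
        to sum to 1."""
        return sum(w * d for w, d in zip(weights, domain_deffs))
    ```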

  • Articles and reports: 12-001-X20060019259
    Description:

    We describe a general approach to setting the sampling design in surveys that are planned for making inferences about small areas (sub-domains). The approach requires a specification of the inferential priorities for the areas. Sample size allocation schemes are derived first for the direct estimator and then for composite and empirical Bayes estimators. The methods are illustrated on an example of planning a survey of the population of Switzerland and estimating the mean or proportion of a variable for each of its 26 cantons.

    Release date: 2006-07-20
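
    As a generic illustration of a compromise between national and area-level precision (not the paper's specific scheme), power allocation interpolates between equal (q = 0) and proportional (q = 1) allocation:

    ```python
    import numpy as np

    def power_allocation(N, n_total, q=0.5):
        """Allocate n_total units across areas of sizes N with n_d
        proportional to N_d**q. Hypothetical illustration."""
        shares = np.asarray(N, dtype=float) ** q
        return n_total * shares / shares.sum()
    ```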

  • Articles and reports: 12-001-X20060019261
    Description:

    Sample allocation can be optimized with respect to various goals. When there is more than one goal, a compromise allocation must be chosen. In the past, the Reverse Record Check achieved that compromise by having a certain fraction of the sample optimally allocated for each goal (for example, two thirds of the sample is allocated to produce good-quality provincial estimates, and one third to produce a good-quality national estimate). This paper suggests a method that involves selecting the maximum of two or more optimal allocations. By analyzing the impact that the precision of population estimates has on the federal government's equalization payments to the provinces, we can set four goals for the Reverse Record Check's provincial sample allocation. The Reverse Record Check's subprovincial sample allocation requires the smoothing of stratum-level parameters. This paper shows how calibration can be used to achieve this smoothing. The calibration problem and its solution do not assume that the calibration constraints have a solution. This avoids convergence problems inherent in related methods such as the raking ratio.

    Release date: 2006-07-20
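
    The compromise described above, taking stratum by stratum the largest sample size that any goal's optimal allocation demands, is direct to express (array layout is illustrative):

    ```python
    import numpy as np

    def compromise_allocation(allocations):
        """allocations: one optimal allocation per goal, shape
        (n_goals, n_strata). Returns, per stratum, the maximum sample
        size required by any single goal."""
        return np.max(np.asarray(allocations), axis=0)
    ```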

  • Articles and reports: 12-001-X20060019262
    Description:

    Hidden human populations, the Internet, and other networked structures conceptualized mathematically as graphs are inherently hard to sample by conventional means, and the most effective study designs usually involve procedures that select the sample by adaptively following links from one node to another. Sample data obtained in such studies are generally not representative, at face value, of the larger population of interest. However, a number of design-based and model-based methods are now available for effective inference from such samples. The design-based methods have the advantage that they do not depend on an assumed population model, but their validity depends on the design being implemented in a controlled and known way, which can be difficult or impossible in practice. The model-based methods allow greater flexibility in the design, but depend on modeling the population using stochastic graph models, and on the design being ignorable or of known form so that it can be included in the likelihood or Bayes equations. For both classes of methods, the weak point is often the lack of control over how the initial sample, from which link-tracing commences, is obtained. The designs described in this paper offer a third way, in which the sample selection probabilities become step by step less dependent on the initial sample selection. A Markov chain "random walk" model idealizes the natural design tendencies of a link-tracing selection sequence through a graph. This paper introduces uniform and targeted walk designs in which the random walk is nudged at each step to produce a design with the desired stationary probabilities. A sample is thus obtained that in important respects is representative, at face value, of the larger population of interest, or that requires only simple weighting factors to make it so.

    Release date: 2006-07-20
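
    One standard way to "nudge" a random walk toward desired stationary probabilities, as described above, is a Metropolis-Hastings acceptance step. The sketch below targets the uniform distribution over nodes; the paper's exact mechanism may differ.

    ```python
    import random

    def uniform_walk(neighbors, start, n_steps, rng=random):
        """Random walk on a graph with moves accepted with probability
        min(1, deg(i)/deg(j)), so that on a connected, non-bipartite graph
        the stationary distribution is uniform over nodes."""
        node, path = start, [start]
        for _ in range(n_steps):
            proposal = rng.choice(neighbors[node])
            accept = min(1.0, len(neighbors[node]) / len(neighbors[proposal]))
            if rng.random() < accept:
                node = proposal
            path.append(node)
        return path
    ```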