Editing and imputation

Results

  • Articles and reports: 12-001-X20060029555
    Description:

    Researchers and policy makers often use data from nationally representative probability sample surveys. The number of topics covered by such surveys, and hence the amount of interviewing time involved, have typically increased over the years, resulting in increased costs and respondent burden. A potential solution to this problem is to carefully form subsets of the items in a survey and administer one such subset to each respondent. Designs of this type are called "split-questionnaire" designs or "matrix sampling" designs. The administration of only a subset of the survey items to each respondent in a matrix sampling design creates what can be considered missing data. Multiple imputation (Rubin 1987), a general-purpose approach developed for handling data with missing values, is appealing for the analysis of data from a matrix sample, because once the multiple imputations are created, data analysts can apply standard methods for analyzing complete data from a sample survey. This paper develops and evaluates a method for creating matrix sampling forms, each form containing a subset of items to be administered to randomly selected respondents. The method can be applied in complex settings, including situations in which skip patterns are present. Forms are created in such a way that each form includes items that are predictive of the excluded items, so that subsequent analyses based on multiple imputation can recover some of the information about the excluded items that would have been collected had there been no matrix sampling. The matrix sampling and multiple-imputation methods are evaluated using data from the National Health and Nutrition Examination Survey, one of many nationally representative probability sample surveys conducted by the National Center for Health Statistics, Centers for Disease Control and Prevention. 
    The study demonstrates the feasibility of the approach applied to a major national health survey with complex structure, and it provides practical advice about appropriate items to include in matrix sampling designs in future surveys.

    Release date: 2006-12-21

  • Articles and reports: 12-001-X20050029044
    Description:

    Complete data methods for estimating the variances of survey estimates are biased when some data are imputed. This paper uses simulation to compare the performance of the model-assisted, the adjusted jackknife, and the multiple imputation methods for estimating the variance of a total when missing items have been imputed using hot deck imputation. The simulation studies the properties of the variance estimates for imputed estimates of totals for the full population and for domains from a single-stage disproportionate stratified sample design when underlying assumptions, such as unbiasedness of the point estimate and item responses being randomly missing within hot deck cells, do not hold. The variance estimators for full population estimates produce confidence intervals with coverage rates near the nominal level even under modest departures from the assumptions, but this finding does not apply for the domain estimates. Coverage is most sensitive to bias in the point estimates. As the simulation demonstrates, even if an imputation method gives almost unbiased estimates for the full population, estimates for domains may be very biased.

    Release date: 2006-02-17

  • Articles and reports: 12-001-X197800254833
    Description:

    Owners of small businesses complain about the quantity of forms they are required to complete for collectors of statistics. Administrative data are an alternative source but do not usually include all the information required by the survey takers.

    The “Tax Data Imputation System” makes use of tax data collected from a large number of businesses by Revenue Canada and data obtained by sample survey for a small subset of these businesses. Survey data are imputed (estimated) for all the businesses not actually surveyed using a “hot-deck” technique, with adjustments made to ensure certain edit rules are satisfied. The results of a simulation study suggest that this procedure has reasonable statistical properties. Estimators (of means or totals) are unbiased, with variances of comparable size to those of the corresponding ratio estimators.

    Release date: 1978-12-15
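
Two of the abstracts above refer to hot deck imputation within cells. As a rough illustration only, the following Python sketch shows one common form of the idea: a missing item is filled with the reported value of a randomly chosen donor from the same imputation cell. The function name `hot_deck_impute`, the field names (`stratum`, `revenue`) and the toy records are all hypothetical and are not taken from any of the listed papers; the sketch omits the edit-rule adjustments and the variance estimation that those papers discuss.

```python
import random
from collections import defaultdict


def hot_deck_impute(records, cell_key, item, seed=0):
    """Return copies of `records` with missing `item` values filled from a
    randomly chosen donor in the same cell (hypothetical helper)."""
    rng = random.Random(seed)

    # Collect the reported (non-missing) values in each imputation cell.
    donors = defaultdict(list)
    for rec in records:
        if rec[item] is not None:
            donors[rec[cell_key]].append(rec[item])

    completed = []
    for rec in records:
        new = dict(rec)          # leave the input records untouched
        new["imputed"] = False   # flag so imputed values stay identifiable
        if rec[item] is None:
            pool = donors.get(rec[cell_key])
            if pool:             # a cell with no donors is left missing
                new[item] = rng.choice(pool)
                new["imputed"] = True
        completed.append(new)
    return completed


# Toy data: cells A and B each have one donor, cell C has none.
records = [
    {"id": 1, "stratum": "A", "revenue": 100.0},
    {"id": 2, "stratum": "A", "revenue": None},
    {"id": 3, "stratum": "B", "revenue": 250.0},
    {"id": 4, "stratum": "B", "revenue": None},
    {"id": 5, "stratum": "C", "revenue": None},
]
completed = hot_deck_impute(records, "stratum", "revenue")
```

Keeping the `imputed` flag matters in practice: as the 2005 abstract notes, complete-data variance formulas are biased when imputed values are treated as if they had been reported, so an analyst needs to know which values were filled in.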