Weighting and estimation


Results

All (21) (0 to 10 of 21 results)

  • Articles and reports: 88F0006X1999009
    Description:

    This working paper presents the estimation procedures used to calculate research and development (R&D) expenditures in the higher education sector for the years 1979-80 to 1997-98.

    Release date: 1999-12-24

  • Surveys and statistical programs – Documentation: 92-371-X
    Description:

    This report deals with sampling and weighting, a process whereby certain characteristics are collected and processed for a random sample of dwellings and persons identified in the complete census enumeration. Data for the whole population are then obtained by scaling up the results for the sample to the full population level. The use of sampling may lead to substantial reductions in costs and respondent burden, or alternatively, can allow the scope of a census to be broadened at the same cost.

    Release date: 1999-12-07

  • Surveys and statistical programs – Documentation: 11-522-X19980015017
    Description:

    Longitudinal studies with repeated observations on individuals permit better characterizations of change and assessment of possible risk factors, but there has been little experience applying sophisticated models for longitudinal data to the complex survey setting. We present results from a comparison of different variance estimation methods for random effects models of change in cognitive function among older adults. The sample design is a stratified sample of people 65 and older, drawn as part of a community-based study designed to examine risk factors for dementia. The model summarizes the population heterogeneity in overall level and rate of change in cognitive function using random effects for intercept and slope. We discuss an unweighted regression including covariates for the stratification variables, a weighted regression, and bootstrapping; we also did preliminary work into using balanced repeated replication and jackknife repeated replication.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015019
    Description:

    The British Labour Force Survey (LFS) is a quarterly household survey with a rotating sample design that can potentially be used to produce longitudinal data, including estimates of labour force gross flows. However, these estimates may be biased due to the effect of non-response. Weighting adjustments are a commonly used method to account for non-response bias. We find that weighting may not fully account for the effect of non-response bias because non-response may depend on the unobserved labour force flows, i.e., the non-response is non-ignorable. To adjust for the effects of non-ignorable non-response, we propose a model for the complex non-response patterns in the LFS which controls for the correlated within-household non-response behaviour found in the survey. The results of modelling suggest that non-response may be non-ignorable in the LFS, causing the weighting estimates to be biased.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015020
    Description:

    At the end of 1993, Eurostat launched a 'community' panel of households. The first wave, carried out in 1994 in the 12 countries of the European Union, included some 7,300 households in France, and at least 14,000 adults 17 years or over. Each individual was then followed up and interviewed each year, even if they had moved. The individuals leaving the sample present a particular profile. In the first part, we present a sketch of how our sample evolves and an analysis of the main characteristics of the non-respondents. We then propose two models to correct for non-response per homogeneous category. We then describe the longitudinal weight distribution obtained from the two models, and the cross-sectional weights using the weight share method. Finally, we compare some indicators calculated using both weighting methods.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015023
    Description:

    The study of social mobility, between labour market statuses or between income levels, for example, is often based on the analysis of mobility matrices. When comparing these transition matrices, with a view to evaluating behavioural changes, one often forgets that the data derive from a sample survey and are therefore affected by sampling variances. Similarly, it is assumed that the responses collected correspond to the 'true value.'

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015026
    Description:

    The purpose of the present study is to utilize panel data from the Current Population Survey (CPS) to examine the effects of unit nonresponse. Because most nonrespondents to the CPS are respondents during at least one month-in-sample, data from other months can be used to compare the characteristics of complete respondents and panel nonrespondents and to evaluate nonresponse adjustment procedures. In the current paper we present analyses utilizing CPS panel data to illustrate the effects of unit nonresponse. After adjusting for nonresponse, additional comparisons are also made to evaluate the effects of nonresponse adjustment. The implications of the findings and suggestions for further research are discussed.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015028
    Description:

    We address the problem of estimation for the income dynamics statistics calculated from complex longitudinal surveys. In addition, we compare two design-based estimators of longitudinal proportions and transition rates in terms of variability under large attrition rates. One estimator is based on the cross-sectional samples for the estimation of the income class boundaries at each time period and on the longitudinal sample for the estimation of the longitudinal counts; the other estimator is entirely based on the longitudinal sample, both for the estimation of the class boundaries and the longitudinal counts. We develop Taylor linearization-type variance estimators for both the longitudinal and the mixed estimator under the assumption of no change in the population, and for the mixed estimator when there is change.

    Release date: 1999-10-22

  • Surveys and statistical programs – Documentation: 11-522-X19980015031
    Description:

    The U.S. Third National Health and Nutrition Examination Survey (NHANES III) was carried out from 1988 to 1994. This survey was intended primarily to provide estimates of cross-sectional parameters believed to be approximately constant over the six-year data collection period. However, for some variables (e.g., serum lead, body mass index and smoking behavior), substantive considerations suggest the possible presence of nontrivial changes in level between 1988 and 1994. For these variables, NHANES III is potentially a valuable source of time-change information, compared to other studies involving more restricted populations and samples. Exploration of possible change over time is complicated by two issues. First, some variables displayed substantial regional differences in level, which was of practical concern. Second, nontrivial changes in level over time can lead to nontrivial biases in some customary NHANES III variance estimators. This paper considers these two problems and discusses some related implications for statistical policy.

    Release date: 1999-10-22

  • Articles and reports: 12-001-X19990014711
    Description:

    We consider the use of calibration estimators when outliers occur. An extension is obtained for the class of Deville and Särndal (1992) calibration estimators based on Wright (1983) QR estimators. It is also obtained by minimizing a general metric subject to constraints on the calibration variables and weights. As an application, this class of estimators helps us consider robust calibration estimators by choosing parameters carefully. This makes it possible, e.g., for cosmetic reasons, to limit robust weights to a predetermined interval. The use of robust estimators with a high breakdown point is also considered. In the specific case of the mean square metric, the estimator proposed by the author is a generalization of a Lee (1991) proposition. The new methodology is illustrated by means of a short simulation study.

    Release date: 1999-10-08
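
The calibration idea in the last entry above can be illustrated with a minimal sketch. It assumes the plain linear (chi-square distance) case, in which design weights d_k are adjusted to w_k = d_k (1 + x_k' lambda) so that the weighted totals of the auxiliary variables match known population totals. It does not implement the robust extension or the weight bounds described in the paper. The function names, sample values and totals are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Move the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def linear_calibration(d, X, totals):
    """Return weights w_k = d_k * (1 + x_k . lam) whose weighted totals equal `totals`."""
    p = len(totals)
    # Design-weighted estimates of the auxiliary totals.
    est = [sum(dk * xk[j] for dk, xk in zip(d, X)) for j in range(p)]
    gap = [totals[j] - est[j] for j in range(p)]
    # Weighted cross-product matrix sum_k d_k x_k x_k'.
    M = [[sum(dk * xk[i] * xk[j] for dk, xk in zip(d, X)) for j in range(p)]
         for i in range(p)]
    lam = solve(M, gap)
    return [dk * (1.0 + sum(l * x for l, x in zip(lam, xk)))
            for dk, xk in zip(d, X)]


# Hypothetical sample of five units: an intercept and one auxiliary variable.
d = [10.0, 10.0, 20.0, 20.0, 40.0]
X = [[1.0, 2.0], [1.0, 3.0], [1.0, 5.0], [1.0, 4.0], [1.0, 6.0]]
totals = [105.0, 520.0]  # assumed known population count and auxiliary total
w = linear_calibration(d, X, totals)
```

With the intercept column included, the calibrated weights sum to the assumed population count as well as reproducing the auxiliary total. Linear calibration can produce negative or extreme weights, which is one motivation for restricting weights to an interval.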

Analysis (13) (0 to 10 of 13 results)

  • Articles and reports: 88F0006X1999009
    Description:

    This working paper presents the estimation procedures used to calculate research and development (R&D) expenditures in the higher education sector for the years 1979-80 to 1997-98.

    Release date: 1999-12-24

  • Articles and reports: 12-001-X19990014711
    Description:

    We consider the use of calibration estimators when outliers occur. An extension is obtained for the class of Deville and Särndal (1992) calibration estimators based on Wright (1983) QR estimators. It is also obtained by minimizing a general metric subject to constraints on the calibration variables and weights. As an application, this class of estimators helps us consider robust calibration estimators by choosing parameters carefully. This makes it possible, e.g., for cosmetic reasons, to limit robust weights to a predetermined interval. The use of robust estimators with a high breakdown point is also considered. In the specific case of the mean square metric, the estimator proposed by the author is a generalization of a Lee (1991) proposition. The new methodology is illustrated by means of a short simulation study.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014712
    Description:

    This paper investigates a repeated sampling approach to take into account auxiliary information in order to improve the precision of estimators. The objective is to build an estimator with a small conditional bias by weighting the observed values by the inverses of the conditional inclusion probabilities. A general approximation is proposed in cases when the auxiliary statistic is a vector of Horvitz-Thompson estimators. This approximation is quite close to the optimal estimator discussed by Fuller and Isaki (1981), Montanari (1987, 1997), Deville (1992) and Rao (1994, 1997). Next, the optimal estimator is applied to a stratified sampling design and it is shown that the optimal estimator can be viewed as a generalised regression estimator for which the stratification indicator variables are also used at the estimation stage. Finally, the application field of this estimator is discussed in the general context of the use of auxiliary information.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014713
    Description:

    Robust small area estimation is studied under a simple random effects model consisting of a basic (or fixed effects) model and a linking model that treats the fixed effects as realizations of a random variable. Under this model a model-assisted estimator of a small area mean is obtained. This estimator depends on the survey weights and remains design-consistent. A model-based estimator of its mean squared error (MSE) is also obtained. Simulation results suggest that the proposed estimator and Kott's (1989) model-assisted estimator are equally efficient, and that the proposed MSE estimator is often much more stable than Kott's MSE estimator, even under moderate deviations of the linking model. The method is also extended to nested error regression models.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014715
    Description:

    The Gallup Organization has been conducting household surveys to study state-wide prevalences of alcohol and drug (e.g., cocaine and marijuana) use. Traditional design-based survey estimates of use and dependence for counties and select demographic groups have unacceptably large standard errors because sample sizes in sub-state groups are too small. Synthetic estimation incorporates demographic information and social indicators in estimates of prevalence through an implicit regression model. Synthetic estimates tend to have smaller variances than design-based estimates, but can be very homogeneous across counties when auxiliary variables are homogeneous. Composite estimates for small areas are weighted averages of design-based survey estimates and synthetic estimates. A second problem generally not encountered at the state level but present for sub-state areas and groups concerns estimating standard errors of estimated prevalences that are close to zero. This difficulty affects not only telephone household survey estimates, but also composite estimates. A hierarchical model is proposed to address this problem. Empirical Bayes composite estimators, which incorporate survey weights, of prevalences and jackknife estimators of their mean squared errors are presented and illustrated.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014716
    Description:

    Two design-based estimators of gross flows and transition rates are considered. One makes use of the cross-sectional samples for the estimation of the income class boundaries at each time period and the longitudinal sample for the estimation of counts of units in the longitudinal population (longitudinal counts); this is the mixed estimator. The other one is entirely based on the longitudinal sample, both for the estimation of the class boundaries and the longitudinal counts; this is the longitudinal estimator. We compare the two estimators in the presence of large attrition rates, by means of a simulation. We find that under a less than perfect model of compensation for attrition, the mixed estimator is usually more sensitive to model bias than the longitudinal estimator. Furthermore, we find that for the mixed estimator, the magnitude of this bias overshadows the small gain in precision when compared to the longitudinal estimator. The results are illustrated with data from the Survey of Labour and Income Dynamics and the Longitudinal Administrative Database of Statistics Canada.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19990014718
    Description:

    In this short note, we demonstrate that the well-known formula for the design effect intuitively proposed by Kish has a model-based justification. The formula can be interpreted as a conservative value for the actual design effect.

    Release date: 1999-10-08

  • Articles and reports: 12-001-X19980024348
    Description:

    Gross flows among labour force states are of great importance in understanding labour market dynamics. Observed flows are typically subject to classification errors, which may induce serious bias. In this paper, some of the most common strategies used to collect longitudinal information about labour force condition are reviewed, together with the modelling approaches developed to correct gross flows affected by classification errors. A general framework for estimating gross flows is outlined. Examples are given of different model specifications, applied to data collected with different strategies. Specifically, two cases are considered: gross flows from (i) the U.S. Survey of Income and Program Participation and (ii) the French Labour Force Survey, a yearly survey collecting retrospective monthly information.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024349
    Description:

    Measurement of gross flows in labour force status is an important objective of the continuing labour force surveys carried out by many national statistics agencies. However, it is well known that estimation of these flows can be complicated by nonresponse, measurement errors, sample rotation and complex design effects. Motivated by nonresponse patterns in household-based surveys, this paper focuses on estimation of labour force gross flows, while simultaneously adjusting for nonignorable nonresponse. Previous model-based approaches to gross flows estimation have assumed nonresponse to be an individual-level process. We propose a class of models that allow for nonignorable household-level nonresponse. A simulation study is used to show that individual-level labour force gross flows estimates from household-based survey data may be biased and that estimates using household-level models can offer a reduction in this bias.

    Release date: 1999-01-14

  • Articles and reports: 12-001-X19980024353
    Description:

    This paper studies response errors in the Current Population Survey of the U.S. Bureau of the Census and assesses their impact on the unemployment rates published by the Bureau of Labor Statistics. The measurement of these error rates is obtained from reinterview data, using an extension of the Hui and Walter (1980) procedure for the evaluation of diagnostic tests. Unlike prior studies which assumed that the reconciled reinterview yields the true status, the method estimates the error rates in both interviews. Using these estimated error rates, we show that the misclassification in the original survey creates a cyclical effect on the reported estimated unemployment rates. In particular, the degree of underestimation increases when true unemployment is high. As there was insufficient data to distinguish between a model assuming that the misclassification rates are the same throughout the business cycle, and one that allows the error rates to differ in periods of low, moderate and high unemployment, our findings should be regarded as preliminary. Nonetheless, they indicate that the relationship between the models used to assess the accuracy of diagnostic tests, and those measuring misclassification rates of survey data, deserves further study.

    Release date: 1999-01-14
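
Several entries above concern gross flows and transition rates between labour force states. As a rough illustration only, the sketch below computes weighted gross flows and the corresponding transition rates from matched records observed at two time points. The state labels, weights and function names are hypothetical, and the sketch makes no adjustment for nonresponse, attrition or classification error, which are the problems the papers themselves address.

```python
from collections import defaultdict


def gross_flows(records):
    """Sum longitudinal weights by (state at t-1, state at t)."""
    flows = defaultdict(float)
    for prev_state, curr_state, weight in records:
        flows[(prev_state, curr_state)] += weight
    return dict(flows)


def transition_rates(flows):
    """Divide each flow by the weighted total of its origin state."""
    origin_totals = defaultdict(float)
    for (prev_state, _), total in flows.items():
        origin_totals[prev_state] += total
    return {pair: total / origin_totals[pair[0]]
            for pair, total in flows.items()}


# Hypothetical matched records: (state at t-1, state at t, longitudinal weight),
# with "E" for employed and "U" for unemployed.
records = [
    ("E", "E", 120.0), ("E", "U", 30.0),
    ("U", "E", 40.0), ("U", "U", 60.0),
    ("E", "E", 150.0),
]
flows = gross_flows(records)
rates = transition_rates(flows)
```

Each row of the resulting transition table sums to one by construction, so the rates read as estimated shares of each origin state moving to each destination state.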