Weighting and estimation

Results

All (15) (0 to 10 of 15 results)

  • Articles and reports: 88F0006X1997013
    Description:

    Statistics Canada is engaged in a project, "Information System for Science and Technology", whose purpose is to develop useful indicators of activity and a framework to tie them together into a coherent picture of science and technology (S&T) in Canada. The Working Papers series is used to publish results of the different initiatives conducted within this project. The data produced relate to the activities, linkages and outcomes of S&T. Several key areas are covered, such as innovation, technology diffusion, human resources in S&T and interrelations between the different actors involved in S&T. This series also presents important data tabulations taken from regular surveys on R&D and S&T, made possible by the existing project.

    Release date: 1998-09-25

  • Articles and reports: 12-001-X19980013904
    Description:

    Many economic and agricultural surveys are multi-purpose. It would be convenient if one could stratify the target population of such a survey for a number of different purposes and then combine the samples for enumeration. We explore four different sampling methods that select similar samples across all stratifications, thereby reducing the overall sample size. Data from an agriculture survey are used to evaluate the effectiveness of these alternative sampling strategies. We then show how a calibration (i.e., reweighted) estimator can increase statistical efficiency by capturing what is known about the original stratum sizes in the estimation. Raking, which has been suggested in the literature for this purpose, is simply one method of calibration.

    Release date: 1998-07-31

  • Articles and reports: 12-001-X19980013905
    Description:

    Two-phase sampling designs offer a variety of possibilities for the use of auxiliary information. We begin by reviewing the different forms that auxiliary information may take in two-phase surveys. We then set up the procedure by which this information is transformed into calibrated weights, which we use to construct efficient estimators of a population total. The calibration is done in two steps: (i) at the population level; (ii) at the level of the first-phase sample. We go on to show that the resulting calibration estimators are also derivable via regression fitting in two steps. We examine these estimators for a special case of interest, namely, when auxiliary information is available for population subgroups called calibration groups. Poststrata are the simplest example of such groups. Estimation for domains of interest and variance estimation are also discussed. These results are illustrated by applying them to two-phase designs at Statistics Canada. The general theory for using auxiliary information in two-phase sampling is being incorporated into Statistics Canada's Generalized Estimation System.

    Release date: 1998-07-31

  • Articles and reports: 12-001-X19980013906
    Description:

    In sample surveys, the units contained in the sampling frame ideally have a one-to-one correspondence with the elements in the target population under study. In many cases, however, the frame has a many-to-many structure. That is, a unit in the frame may be associated with multiple target population elements and a target population element may be associated with multiple frame units. Such was the case in a building characteristics survey in which the frame was a list of street addresses, but the target population was commercial buildings. The frame was messy because a street address corresponded either to a single building, multiple buildings, or part of a building. In this paper, we develop estimators and formulas for their variances in both simple and stratified random sampling designs when the frame has a many-to-many structure.

    Release date: 1998-07-31

  • Articles and reports: 12-001-X19980013907
    Description:

    Least squares estimation for repeated surveys is addressed. Several estimators of current level, change in level and average level for multiple time periods are developed. The Recursive Regression Estimator, a recursive computational form of the best linear unbiased estimator based on all periods of the survey, is presented. It is shown that the recursive regression procedure converges and that the dimension of the estimation problem is bounded as the number of periods increases indefinitely. The recursive procedure offers a solution to the problem of computational complexity associated with minimum variance unbiased estimation in repeated surveys. Data from the U.S. Current Population Survey are used to compare alternative estimators under two types of rotation designs: the intermittent rotation design used in the U.S. Current Population Survey, and two continuous rotation designs.

    Release date: 1998-07-31

  • Articles and reports: 12-001-X19980013908
    Description:

    This investigation considers the problem of estimating the variance of the general linear regression estimator (GREG). It is shown that the efficiency of the low level calibration approach adopted by Särndal (1996) is less than or equal to that of a class of estimators proposed by Deng and Wu (1987). A higher level calibration approach is also suggested, and its efficiency is shown to improve on that of the original approach. Several estimators are shown to be special cases of the proposed higher level calibration approach. An idea for finding a non-negative estimate of the variance of the GREG is suggested. Results are extended to a stratified random sampling design. An empirical study was also carried out to examine the performance of the proposed strategies. The well-known statistical package GES, developed at Statistics Canada, could be further improved to obtain better estimates of the variance of the GREG by using the proposed higher level calibration approach under certain circumstances discussed in this paper.

    Release date: 1998-07-31

  • Articles and reports: 12-001-X19980013909
    Description:

    In this paper we study the model-assisted estimation of class frequencies of a discrete response variable by a new survey estimation method, which is closely related to generalized regression estimation. In generalized regression estimation the available auxiliary data are incorporated in the estimation procedure by a linear model fit. Instead of using a linear model for the class indicators, we describe the joint distribution of the class indicators by a multinomial logistic model. Logistic generalized regression estimators are introduced for class frequencies in a population and domains. Monte Carlo experiments were carried out for simulated data and for real data taken from the Labour Force Survey conducted monthly by Statistics Finland. The logistic generalized regression estimation yielded better results than the ordinary regression estimation for small domains and particularly for small class frequencies.

    Release date: 1998-07-31

  • Articles and reports: 12-001-X19980013910
    Description:

    Let A be a population domain of interest and assume that the elements of A cannot be identified on the sampling frame and the number of elements in A is not known. Further assume that a sample of fixed size (say n) is selected from the entire frame and the resulting domain sample size (say n_A) is random. The problem addressed is the construction of a confidence interval for a domain parameter such as the domain aggregate T_A = \sum_{i \in A} x_i. The usual approach to this problem is to redefine x_i, by setting x_i = 0 if i \notin A. Thus, the construction of a confidence interval for the domain total is recast as the construction of a confidence interval for a population total which can be addressed (at least asymptotically in n) by normal theory. As an alternative, we condition on n_A and construct confidence intervals which have approximately nominal coverage under certain assumptions regarding the domain population. We evaluate the new approach empirically using artificial populations and data from the Bureau of Labor Statistics (BLS) Occupational Compensation Survey.

    Release date: 1998-07-31

  • Articles and reports: 12-001-X19980013911
    Description:

    This paper examines the main properties of the generalized regression estimator of a finite population mean and those of the regression estimator obtained from the optimal difference estimator. Given that the latter can be more efficient than the former, conditions allowing this to happen are established, and a criterion for choosing between the two types of regression estimators follows. A simulation study illustrates their finite sample performances.

    Release date: 1998-07-31

  • Articles and reports: 12-001-X19980013912
    Description:

    Efficient estimates of population size and totals based on information from multiple list frames and an independent area frame are considered. This work is an extension of the methodology proposed by Hartley (1962), which considers two general frames. A main disadvantage of list frames is that they are typically incomplete. In this paper, we propose several methods to address frame deficiencies. A joint list-area sampling design incorporates multiple frames and achieves full coverage of the target population. For each combination of frames, we present the appropriate notation, likelihood function, and parameter estimators. Results from a simulation study that compares the various properties of the proposed estimators are also presented.

    Release date: 1998-07-31
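
Several of the abstracts above refer to calibration, and the abstract for 12-001-X19980013904 notes that raking is one method of calibration. As a rough illustration only, the sketch below rakes a set of design weights so that the weighted counts match known totals for two stratifications. It is not the estimator from any of the listed papers: the function name `rake`, the sample data and the control totals are all hypothetical, and the two sets of totals are assumed to sum to the same population size.

```python
def rake(weights, row_of, col_of, row_totals, col_totals, iters=1000, tol=1e-10):
    """Rake weights so weighted counts match two sets of known margins.

    weights    : starting (design) weights, one per sampled unit
    row_of     : category label of each unit under the first stratification
    col_of     : category label of each unit under the second stratification
    row_totals : known population count for each first-stratification category
    col_totals : known population count for each second-stratification category
    """
    w = list(weights)
    for _ in range(iters):
        max_change = 0.0
        # Alternate between the two margins (iterative proportional fitting).
        for labels, targets in ((row_of, row_totals), (col_of, col_totals)):
            # Current weighted count in each category of this margin.
            current = {k: 0.0 for k in targets}
            for wi, lab in zip(w, labels):
                current[lab] += wi
            # Scale each unit's weight by target / current for its category.
            for i, lab in enumerate(labels):
                factor = targets[lab] / current[lab]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        # Stop once a full pass leaves the weights essentially unchanged.
        if max_change < tol:
            break
    return w


# Hypothetical sample of six units, each with a design weight of 10.
design_w = [10.0] * 6
stratum_a = ["a1", "a1", "a1", "a2", "a2", "a2"]
stratum_b = ["b1", "b2", "b1", "b2", "b1", "b2"]

# Hypothetical known stratum sizes (both sets sum to 70).
totals_a = {"a1": 40.0, "a2": 30.0}
totals_b = {"b1": 35.0, "b2": 35.0}

raked = rake(design_w, stratum_a, stratum_b, totals_a, totals_b)
for a, b, wi in zip(stratum_a, stratum_b, raked):
    print(a, b, round(wi, 3))
```

After raking, the weights in each category of either stratification sum to that category's known size, which is the sense in which a calibrated estimator captures what is known about the original stratum sizes.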