Comparison of unit level and area level small area estimators
1. Introduction
In recent years, model-based small area estimators have been widely used in practice to provide reliable indirect estimates for small areas. These estimators are based on explicit models that provide a link to related small areas through supplementary data such as census and administrative records. Small area models can be classified into two broad types: (i) unit level models, which relate the unit values of the study variable to unit-specific auxiliary variables; and (ii) area level models, which relate direct estimators of the study variable for each small area to the corresponding area-specific auxiliary variables. In general, area level models are used to improve the direct estimators when unit level data are not available.
The sampling set-up is as in Rao (2003). That is, a universe is split into non-overlapping small areas, and sampling is carried out in each small area using a probabilistic mechanism. Each element selected in a sample has an associated selection probability, and the design weights are derived from these selection probabilities. In practice, these weights can be adjusted to account for non-response and/or auxiliary information; the resulting weights are known as the survey weights. In this paper, we assume full response to the survey and no adjustment to the auxiliary data. Direct area level estimates are obtained for each area using the survey weights and the unit observations from the area.
The survey design can be incorporated into small area models in different ways. In the area level case, the direct design-based estimators are modeled directly, and the sampling variance of the associated direct estimator is introduced into the model via the design-based errors. In the unit level case, the observations can be weighted using the survey weights.
A number of factors affect the success of these estimators. Two important ones are whether the assumed model is correct and whether the variable of interest is correlated with the selection probabilities associated with the sampling process, that is, whether the sampling process is informative. In this paper, we compare, via a simulation study, the impact of model misspecification and of the informativeness of the sampling design on two basic small area procedures, one based on a unit level model and one on an area level model, in terms of bias, estimated mean squared error and confidence interval coverage rates. A sampling design is informative if the selection probabilities are related to the variable of interest even after conditioning on the covariates. In such cases, the population model no longer holds for the sample. Pfeffermann and Sverchkov (2007) accounted for this possibility by adjusting the small area procedures, and Verret, Rao and Hidiroglou (2015) simplified the procedure. In this paper, we do not adjust the small area procedures for informativeness, but study its impact.
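The notion of informativeness can be illustrated with a small, purely hypothetical simulation. The sketch below is not the simulation of Section 4. The population model, the population and sample sizes, the two size measures and the helper name `ppswr_sample` are all assumptions chosen for illustration. Units are drawn with probability proportional to size, with replacement. In the first design the size measure depends only on the covariate. In the second it also depends on the model error, so the selection probabilities remain related to the study variable after conditioning on the covariate.

```python
import math
import random


def ppswr_sample(size_measures, n, rng):
    """Draw n unit indices with replacement, probability proportional to size."""
    total = sum(size_measures)
    probs = [z / total for z in size_measures]  # one-draw selection probabilities
    draws = rng.choices(range(len(size_measures)), weights=probs, k=n)
    return draws, probs


rng = random.Random(2016)  # fixed seed so the sketch is reproducible
N, n = 5000, 500           # hypothetical population and sample sizes

# Hypothetical population model: y = 1 + 2x + e, with e independent of x.
x = [rng.uniform(1.0, 3.0) for _ in range(N)]
e = [rng.gauss(0.0, 1.0) for _ in range(N)]
y = [1.0 + 2.0 * xk + ek for xk, ek in zip(x, e)]
pop_mean = sum(y) / N

designs = {
    # Size measure depends on the covariate only: non-informative given x.
    "non-informative": x,
    # Size measure also depends on the model error: informative given x.
    "informative": [xk * math.exp(0.5 * ek) for xk, ek in zip(x, e)],
}

results = {}
for label, z in designs.items():
    draws, probs = ppswr_sample(z, n, rng)
    # Mean of the true model errors among the sampled units; a value far
    # from zero indicates that the population model fails in the sample.
    resid_mean = sum(e[k] for k in draws) / n
    # Design-weighted (Hansen-Hurwitz) estimator of the population mean.
    hh_mean = sum(y[k] / probs[k] for k in draws) / (n * N)
    results[label] = (resid_mean, hh_mean)
    print(f"{label:16s} mean sampled error = {resid_mean:6.3f}  "
          f"weighted mean = {hh_mean:6.3f}  population mean = {pop_mean:6.3f}")
```

Under these assumptions, one would expect the mean sampled error to be close to zero for the first design and clearly positive for the second, while the design-weighted estimator stays close to the population mean in both cases. This is the sense in which an unadjusted model-based procedure can be affected by an informative design.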
The paper is structured as follows. The point estimators and associated mean squared error estimators for the unit level and area level models are described in Sections 2 and 3, respectively. The simulation and its results are described in Section 4. The simulation computes the point estimates and associated mean squared error estimates under a probability proportional to size with replacement (PPSWR) sampling scheme, varying the following two factors: (a) whether the assumed model is correct or incorrect; and (b) the degree of design informativeness, which ranges from non-significant to very significant. In Section 5, we give an example using data from Battese, Harter and Fuller (1988) that compares the unit level and area level estimates. Finally, conclusions resulting from this work are presented in Section 6.