1 Introduction

Yong You, J.N.K. Rao and Mike Hidiroglou


Subpopulations or domains are called small areas when the domain sample sizes are not large enough to provide reliable area-specific direct estimates of domain parameters. In such cases, it becomes necessary to use model-based indirect estimators, which borrow sample data from related areas through linking models and thereby achieve significant gains in efficiency over direct estimators. In this paper we focus on a basic area level model, called the Fay-Herriot (FH) model (Fay and Herriot 1979), and the associated empirical best linear unbiased predictors (EBLUPs) of small area means.
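For orientation, the basic area level model and the resulting predictor can be written in their usual textbook form for areas $i = 1, \ldots, m$. The symbols used here ($\hat{\theta}_i$ for the direct estimator, $\psi_i$ for the known sampling variance, $\sigma_v^2$ for the model variance, $\gamma_i$ for the shrinkage weight) are an assumption based on common usage and may differ from the notation adopted in later sections.

```latex
\begin{align*}
\hat{\theta}_i &= \theta_i + e_i, & e_i &\sim (0, \psi_i) && \text{(sampling model, $\psi_i$ known)},\\
\theta_i &= x_i^{\top}\beta + v_i, & v_i &\sim (0, \sigma_v^2) && \text{(linking model)},\\
\hat{\theta}_i^{\mathrm{EBLUP}} &= \hat{\gamma}_i\,\hat{\theta}_i + (1-\hat{\gamma}_i)\,x_i^{\top}\hat{\beta}, & \hat{\gamma}_i &= \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \psi_i}.
\end{align*}
```

The predictor is a weighted average of the direct estimator and the regression-synthetic estimator $x_i^{\top}\hat{\beta}$; areas with large sampling variance $\psi_i$ receive a small $\hat{\gamma}_i$ and are pulled more strongly toward the model.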

Those EBLUPs do not necessarily agree with direct estimators for aggregates. The direct aggregate estimators are preferred in practice because they are based on sample sizes large enough to satisfy reliability requirements and do not depend on models. As a result, benchmarking is often done to force agreement. In this paper, we focus on two methods of self benchmarking, based on modifications to the EBLUPs. The first method (WFQ), due to Wang, Fuller and Qu (Wang et al. 2008), uses an augmented FH model and the associated EBLUPs, denoted as WFQ estimators. The second method (YR), also proposed by Wang, Fuller and Qu, is based on an approach used by You and Rao (2002) in the context of unit level models. The YR estimators are obtained by modifying the optimal estimators of the regression parameters used in the EBLUPs to force agreement. Because of these modifications, the benchmarked estimators WFQ and YR will have higher mean squared prediction errors (MSPEs) than the EBLUPs.
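The idea of forcing agreement by modifying the regression estimator can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the data are simulated, the model variance `sigma2_v` is treated as known rather than estimated, the covariates include an intercept, and the modified regression estimator uses weights of the form $w_i(1-\gamma_i)$, where the $w_i$ are the aggregation weights. The function name `fh_predict` and all variable names are hypothetical.

```python
import numpy as np

def fh_predict(y, X, psi, sigma2_v, w=None):
    """Area-level predictions gamma_i*y_i + (1-gamma_i)*x_i'beta.

    Assumes sigma2_v is known (in practice it is estimated).
    w=None -> beta from the usual weighted least squares fit.
    w given -> beta from a modified fit with weights w_i*(1-gamma_i),
               an assumed form of the YR-type adjustment.
    """
    gamma = sigma2_v / (sigma2_v + psi)
    a = 1.0 / (sigma2_v + psi) if w is None else w * (1.0 - gamma)
    XtA = X.T * a                      # scales column i of X.T by a_i
    beta = np.linalg.solve(XtA @ X, XtA @ y)
    return gamma * y + (1.0 - gamma) * (X @ beta)

# Simulated illustration (all values are made up for the sketch).
rng = np.random.default_rng(0)
m = 30
X = np.column_stack([np.ones(m), rng.normal(size=m)])  # intercept + one covariate
psi = rng.uniform(0.5, 2.0, size=m)                    # known sampling variances
sigma2_v = 1.0
theta = X @ np.array([2.0, 1.0]) + rng.normal(0.0, np.sqrt(sigma2_v), size=m)
y = theta + rng.normal(0.0, np.sqrt(psi))              # direct estimates
w = rng.uniform(1.0, 5.0, size=m)
w = w / w.sum()                                        # aggregation weights

direct_total = w @ y
plain_total = w @ fh_predict(y, X, psi, sigma2_v)
modified_total = w @ fh_predict(y, X, psi, sigma2_v, w=w)

print("direct aggregate:        ", direct_total)
print("unadjusted aggregate:    ", plain_total)
print("modified-beta aggregate: ", modified_total)
```

Under these assumptions the modified fit reproduces the direct aggregate up to floating-point error, because the intercept row of its normal equations is exactly the benchmarking constraint; the unadjusted fit carries no such guarantee.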

This paper has two main objectives. The first objective is to compare MSPEs and their estimators for the YR and WFQ estimators of small area means. Section 2 reviews the expressions for MSPEs and associated estimators for the EBLUP and the WFQ estimators. Section 3 develops expressions for MSPE and its estimator for the YR estimator. Section 4 compares the MSPEs and their estimators in a simulation study.

The second objective of our paper is to examine the performance of the WFQ and YR estimators and their MSPE estimators when the linking model in the FH model is misspecified due to an omitted variable. WFQ also studied the effect of misspecification of the linking model for a particular example, for which they showed that the YR estimator led to large bias whereas the WFQ estimator did not. However, this result was due to the fact that, in their simulation study, the augmenting variable was highly correlated with the omitted covariate (correlation coefficient ρ = 0.983). As a result, the augmented model used was in fact close to the true model, leading to the superior performance of the WFQ estimator. In Section 4, we consider a different omitted variable that is weakly correlated with the augmenting variable (ρ = -0.116), so that the augmented model is also misspecified. We compare the biases, MSPEs and their estimators for the EBLUP, YR and WFQ estimators obtained under the misspecified model.
