Comparison of some positive variance estimators for the Fay-Herriot small area model

1. Introduction

The Fay-Herriot model (Fay and Herriot 1979) is a basic area level model used to estimate small area means when the available direct survey estimates are imprecise due to small sample sizes. In this model, the small area mean is represented by a non-random linear term in the covariates plus a random area effect. The best linear unbiased prediction (BLUP) estimator of a small area mean under the Fay-Herriot model can be obtained by minimizing the mean squared error (MSE) among the class of linear unbiased estimators. The BLUP is a weighted average of the direct survey estimator and the regression-synthetic estimator, with weights depending on the variance of the random area effects, $\sigma_v^2$. Usually, this variance has to be estimated from the data under the Fay-Herriot model. The empirical best linear unbiased prediction (EBLUP) estimator of the small area mean is obtained by replacing the variance in the formula of the BLUP with an estimate. Many well-known methods of variance estimation are used in this context, but the one used most often is the restricted maximum likelihood (REML) estimator, because it accounts for the loss of degrees of freedom due to estimating the regression coefficients. Furthermore, it is unbiased up to the second order, and it also converges faster in terms of the number of iterations.
Despite these important characteristics, the REML method occasionally yields a zero variance estimate, particularly when the number of areas, $m$, is small or moderate. This implies zero weight on the direct survey estimator in the EBLUP formula, and hence the EBLUP estimator becomes a regression-synthetic estimator. However, most practitioners are reluctant to use synthetic estimators for small area means, since these ignore the survey-based information and are often quite biased. When dealing with real data sets, for which models are never perfect, a positive estimate of $\sigma_v^2$ reduces the bias of the EBLUP relative to the synthetic estimator. Certainly, a positive random effects variance estimate results in a ‘conservative’ EBLUP estimator, in the sense that it gives a positive weight to the direct survey estimator. Furthermore, the EBLUP can then be viewed as the sum of the regression estimator and a non-zero term that accounts for part of the ‘model bias’. This consideration has given rise to a series of variance estimation methods that yield positive estimates.
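The weighted-average form described above can be illustrated with a minimal Python sketch. It assumes the standard Fay-Herriot weight $\gamma_i = \sigma_v^2 / (\sigma_v^2 + \psi_i)$, where $\psi_i$ is the sampling variance of area $i$; this section does not spell the weight out. The function name `eblup` and the input values are hypothetical, and the regression-synthetic estimates are taken as given rather than fitted.

```python
import numpy as np

def eblup(direct, synthetic, sampling_var, sigma2_v):
    """Weighted average of direct and regression-synthetic estimates.

    Assumes the weight gamma_i = sigma2_v / (sigma2_v + psi_i).
    The synthetic estimates are treated as given inputs (an assumption).
    """
    direct = np.asarray(direct, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    psi = np.asarray(sampling_var, dtype=float)
    gamma = sigma2_v / (sigma2_v + psi)  # weight on the direct estimator
    return gamma * direct + (1.0 - gamma) * synthetic

# Illustrative values only, not taken from the article.
direct = [10.0, 20.0]
synthetic = [12.0, 18.0]
psi = [1.0, 4.0]

# With sigma2_v = 0 every weight is 0, so the result equals the
# synthetic estimates and the direct estimates are ignored.
print(eblup(direct, synthetic, psi, sigma2_v=0.0))

# With a positive variance estimate the direct estimates get positive weight.
print(eblup(direct, synthetic, psi, sigma2_v=1.0))
```

The first call shows the situation practitioners want to avoid: a zero variance estimate discards the survey information for every area, whatever the sampling variances are.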

In this article, we focus on the adjusted likelihood variance estimators developed by Lahiri and Li (2009) and we propose a MIX variance estimator. Our MIX variance estimator is the combination of a REML estimator and any of the adjusted likelihood methods. We also put forward an estimator of the MSE of the EBLUP under the MIX and investigate the theoretical and finite sample properties of both the MIX variance estimator and MSE estimator.

Morris (2006) and Lahiri and Li (2009) proposed adjusted likelihood variance estimators resulting from optimizing the profile and residual likelihoods adjusted with a factor $h(\sigma_v^2)$, $\sigma_v^2 > 0$. Li and Lahiri (2011) proposed two methods of variance estimation (the AM.LL and AR.LL methods, associated with the profile and residual likelihoods respectively) that ensure positive estimates, with adjustment factor $h_{\text{LL}}(\sigma_v^2) = \sigma_v^2$. Yoshimori and Lahiri (2014) proposed two other variance estimators (the AM.YL and AR.YL methods) derived from adjusting the profile and residual likelihoods with the factor

$$h_{\text{YL}}(\sigma_v^2) = \left\{ \arctan\left[ \sum_{i=1}^{m} \frac{\sigma_v^2}{\sigma_v^2 + \psi_i} \right] \right\}^{1/m},$$

where $\psi_i$ is the sampling variance for the $i$th area. It is well known that the LL estimators are biased, especially for a small or moderate number of areas (see Lahiri and Pramanik 2011). The YL method that adjusts the profile likelihood also leads to a biased estimator of $\sigma_v^2$. However, the bias of the variance estimator does not affect the MSE of the EBLUP: the second order asymptotic approximation to the MSE shows that the MSE depends on the asymptotic variance of the variance estimator and not on its bias. The bias does, however, affect the Taylor linearization MSE estimators, and it can lead to negatively biased MSE estimators. It is therefore desirable to investigate alternative positive variance estimators.
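The two adjustment factors named above can be evaluated with a short Python sketch. The function names `h_ll` and `h_yl` and the sampling variances are hypothetical. The sketch only evaluates the factors; it does not maximize any likelihood. If the adjustment is multiplicative (an assumption here, since the text only says the likelihoods are "adjusted with a factor"), a factor that vanishes at $\sigma_v^2 = 0$ makes the adjusted likelihood vanish there too, which is consistent with these methods yielding positive estimates.

```python
import numpy as np

def h_ll(sigma2_v):
    """LL adjustment factor: h_LL(sigma2_v) = sigma2_v."""
    return sigma2_v

def h_yl(sigma2_v, sampling_var):
    """YL adjustment factor:
    {arctan[sum_i sigma2_v / (sigma2_v + psi_i)]}^(1/m)."""
    psi = np.asarray(sampling_var, dtype=float)
    m = psi.size  # number of areas
    total = np.sum(sigma2_v / (sigma2_v + psi))
    return np.arctan(total) ** (1.0 / m)

# Illustrative sampling variances, not taken from the article.
psi = [0.5, 1.0, 2.0]

for s2 in (0.0, 0.1, 1.0, 10.0):
    print(s2, h_ll(s2), h_yl(s2, psi))
```

Both factors equal zero at $\sigma_v^2 = 0$. Because the arctangent is bounded by $\pi/2$ and the exponent is $1/m$, the YL factor stays below $(\pi/2)^{1/m}$, whereas the LL factor grows without bound.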

The method of combining the AM.LL and the REML variance estimators was first mentioned by Yuan (2009) for the Fay-Herriot model. However, Yuan (2009) did not study its properties, empirically or otherwise. Rubin-Bleuer, Yung and Landry (2010, 2011 and 2012) carried out empirical comparisons of a MIX variance estimator under a time series and cross-sectional area level model, and Rubin-Bleuer and You (2012) studied the asymptotic and finite sample properties of the MIX variance estimator for the Fay-Herriot model.

Here we formalize the MIX method for the Fay-Herriot model and prove that the MIX variance estimator is unbiased up to the second order. Furthermore, we propose an MSE estimator of the Taylor linearization type. We also examine the empirical performance of the MIX for a small and moderate number of areas. With respect to MSE estimation, Rubin-Bleuer and You (2012) and Molina, Rao and Datta (2015) each proposed a different ‘split’ MSE estimator under MIX variance estimation. We show that both the Rubin-Bleuer and You (2012) and the Molina et al. (2015) MSE estimators are unbiased up to the second order. These ‘split’ MSE estimators apply one rule to populations that yield zero estimates under REML variance estimation and another rule to populations that yield positive estimates under REML variance estimation. Both papers showed that, for a small number of areas, these ‘split’ estimators behaved well empirically in terms of average relative bias. However, this outcome could be misleading, since the MSE estimators are usually negatively biased for populations where the REML variance estimate is zero and positively biased for populations with positive REML estimates, so the biases cancel out on average. In view of this issue, we propose another MSE estimator, and we compare it to other MSE estimators conditional on populations where the REML estimate is zero.
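The split between populations with zero and positive REML estimates suggests a selection rule of the following kind. This is a minimal sketch under a stated assumption: that the MIX estimator keeps the REML estimate when it is positive and otherwise falls back on an adjusted likelihood estimate. The formal definition is given in Section 4 and should be taken as authoritative. The function name `mix_variance` and its arguments are hypothetical, and both input estimates are assumed to have been computed elsewhere.

```python
def mix_variance(reml_estimate, adjusted_estimate):
    """Hypothetical MIX rule (an assumption, not the article's definition):
    keep the REML estimate when it is positive, otherwise fall back on
    the strictly positive adjusted likelihood estimate."""
    if reml_estimate > 0.0:
        return reml_estimate
    return adjusted_estimate

# Illustrative values only, not taken from the article.
print(mix_variance(0.0, 0.3))  # REML is zero, so the adjusted estimate is used
print(mix_variance(1.2, 0.3))  # REML is positive, so it is kept
```

Under this rule the adjusted likelihood estimator, and therefore its bias, only enters for the populations where REML fails to give a positive estimate.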

In Section 2, we introduce the Fay-Herriot model, the EBLUP estimator of the small area mean and a second order approximation of the MSE of the EBLUP under the model. In Section 3, we describe the REML estimator and the *.LL and *.YL variance estimators. In Section 4, we present a general MIX variance estimator and prove that its bias is of the same order as the bias of the REML estimator. We also propose an estimator of the MSE under the MIX method that is unbiased up to the second order. In Section 5, we conduct an empirical study to compare the different variance estimators. Note that we define the MIX variance estimator as a combination of the REML estimator and any of the adjusted likelihood variance estimators, but the MIX variance estimator we chose for this study combines the REML estimator with the AM.LL variance estimator. We selected this combination because Li and Lahiri (2011) reported that the adjusted profile likelihood (AM.LL) performed better than the adjusted residual likelihood (AR.LL), and because the adjustment factor in the Yoshimori and Lahiri (2014) variance estimators is too close to zero (in log terms) to improve significantly on the REML method. Finally, in Section 6, we present the simulation results, analysis and conclusion.
