
1. Introduction

Jiming Jiang, Thuan Nguyen and J. Sunil Rao

Observed best prediction (OBP; Jiang, Nguyen and Rao 2011) is a new method for small area estimation (SAE; e.g., Rao 2003). It is motivated by the fact that the best linear unbiased prediction (BLUP) is a hybrid of best prediction (BP) and maximum likelihood (ML) estimation, while the main interest in SAE is typically a prediction problem. The OBP derives parameter estimation based on a purely predictive consideration, leading to the so-called best predictive estimator (BPE) of the model parameters. The development of the OBP in Jiang et al. (2011) mainly focuses on the Fay-Herriot model (Fay and Herriot 1979). Another important class of SAE models is the nested-error regression (NER) model, introduced by Battese, Harter and Fuller (1988). The NER model may be expressed as

$y_{i j} = {x^{'}}_{i j} β + v_{i} + e_{i j}, (1.1)$

$i = 1, \dots, m, j = 1, \dots, n_{i},$ where the $v_{i}$'s are the area-specific random effects and the $e_{i j}$'s are errors, which are assumed to be independent and normally distributed with mean zero, $var (v_{i}) = σ_{v}^{2}$ and $var (e_{i j}) = σ_{e}^{2},$ where $σ_{v}^{2}$ and $σ_{e}^{2}$ are unknown. Under the NER model, the small area mean, assuming infinite population, is $θ_{i} = {\bar{X}}^{'}_{i} β + v_{i}$ for the $i^{th}$ small area, where ${\bar{X}}_{i}$ is the population mean of the $x_{i j}$'s (assumed known; e.g., Rao 2003). It is seen that $θ_{i}$ is a (linear) mixed effect. Let $γ = σ_{v}^{2} / σ_{e}^{2} .$ Then, the best predictor (BP) of $θ_{i}$ is obtained by minimizing the model-based mean squared prediction error (MSPE),

$E_{M} {({\overset{⌣}{θ}}_{i} - θ_{i})}^{2}, (1.2)$

where $E_{M}$ denotes expectation under the assumed NER model, and ${\overset{⌣}{θ}}_{i}$ denotes a predictor of $θ_{i} .$ By normal theory (e.g., Jiang 2007, page 237), the BP is given by

${\tilde{θ}}_{i} = E_{M} (θ_{i} | y_{i}) = {\bar{X}}^{'}_{i} β + \frac{n_{i} γ}{1 + n_{i} γ} ({\bar{y}}_{i \cdot} - {\bar{x}}^{'}_{i \cdot} β), (1.3)$

where $y_{i} = {(y_{i j})}_{1 \leq j \leq n_{i}}, β$ and $γ$ are the true parameters, ${\bar{y}}_{i \cdot} = n_{i}^{- 1} \sum_{j = 1}^{n_{i}} y_{i j}$ and ${\bar{x}}_{i \cdot} = n_{i}^{- 1} \sum_{j = 1}^{n_{i}} x_{i j} .$ The traditional best linear unbiased prediction (BLUP) method is based on (1.3) with $β$ replaced by its ML estimator, assuming that $γ$ is known; and the empirical BLUP (EBLUP) is derived from the BLUP with $γ$ replaced by a consistent estimator.

The OBP procedure (Jiang et al. 2011) derives estimators of $β$ and $γ,$ namely the BPE, by minimizing the observed, design-based MSPE, which is completely different from the traditional methods such as maximum likelihood (ML) and restricted maximum likelihood (REML; e.g., Jiang 2007). Throughout this paper, we assume that the samples are drawn via simple random sampling, without replacement, from each small area, which is what the design-based approach is based upon. Write $ψ = {(β^{'}, γ)}^{'} .$ Note that, in practice, the small area populations are finite. Following Jiang et al. (2011), we consider a super-population NER model. Suppose that the subpopulations of responses $\{Y_{i k}, k = 1, \dots, N_{i}\}$ and auxiliary data $\{X_{i k l}, k = 1, \dots, N_{i}\}, l = 1, \dots, p,$ are realizations from corresponding super-populations that are assumed to satisfy the NER model. It follows that

$Y_{i k} = {X^{'}}_{i k} β + v_{i} + e_{i k}, i = 1, \dots, m, k = 1, \dots, N_{i}, (1.4)$

where $β, v_{i}$ and $e_{i k}$ satisfy the same assumptions as in (1.1). Under the finite-population setting, the true small area mean is $θ_{i} = {\bar{Y}}_{i} = N_{i}^{- 1} \sum_{k = 1}^{N_{i}} Y_{i k}$ (as opposed to $θ_{i} = {\bar{X}}^{'}_{i} β + v_{i}$ under the infinite-population setting) for $1 \leq i \leq m .$ Furthermore, write $r_{i} = n_{i} / N_{i} .$ Then, the finite-population version of the BP (1.3) has the expression (e.g., Rao 2003, Section 7.2.5)

${\tilde{θ}}_{i} = E_{M} (θ_{i} | y_{i}) = {\bar{X}}^{'}_{i} β + \left\{ r_{i} + (1 - r_{i}) \frac{n_{i} γ}{1 + n_{i} γ} \right\} ({\bar{y}}_{i \cdot} - {\bar{x}}^{'}_{i \cdot} β), (1.5)$

where $E_{M}$ denotes (conditional) expectation under the assumed super-population NER model, and $β$ and $γ$ are the true parameters. Note that the BP is model-dependent.
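As an illustration only, the following Python sketch evaluates the finite-population BP (1.5) for a single area. The function names, argument names and numerical inputs are hypothetical and not part of the paper; setting $r_{i} = 0$ (an infinite population) recovers (1.3).

```python
def shrinkage_weight(n, N, gamma):
    """Weight on the sample residual in (1.5): r + (1 - r) * n*gamma / (1 + n*gamma), r = n / N."""
    r = n / N
    return r + (1.0 - r) * n * gamma / (1.0 + n * gamma)


def bp_finite(xbar_pop, xbar_samp, ybar_samp, n, N, beta, gamma):
    """Finite-population best predictor (1.5) of the area mean, for one area."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    a = shrinkage_weight(n, N, gamma)
    # Regression-synthetic part plus a weighted sample residual.
    return dot(xbar_pop, beta) + a * (ybar_samp - dot(xbar_samp, beta))


# Hypothetical inputs for one area (assumed values, not from the paper).
theta_tilde = bp_finite(
    xbar_pop=[1.0, 2.0], xbar_samp=[1.0, 3.0], ybar_samp=4.0,
    n=4, N=16, beta=[0.5, 1.0], gamma=0.25,
)
print(theta_tilde)
```

The weight increases with both the sampling fraction and $n_{i} γ$, so the predictor leans more on the sample residual ${\bar{y}}_{i \cdot} - {\bar{x}}^{'}_{i \cdot} β$ when more of the area is observed or when the between-area variance dominates.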

In practice, any assumed model is subject to misspecification. Jiang et al. (2011) consider misspecification of the mean function, while assuming that the variance-covariance structure of the data is correctly specified. However, the latter, too, may be misspecified in practice. In this paper, we extend the potential model misspecification to both the mean function and the variance-covariance structure. One possible misspecification of the variance-covariance structure is heteroscedasticity, defined in terms of $var (e_{i j}) = σ_{i}^{2}$ for area $i, 1 \leq i \leq m,$ where the $σ_{i}^{2}$'s are unknown and possibly different. However, in spite of the potential model misspecification, there are reasons that one cannot "abandon" the assumed model and the model-based BP. First, the assumed model and BP are relatively simple to use and, therefore, attractive to practitioners; in particular, they utilize a simple (linear) relationship between the response and auxiliary variables. For example, in contrast to (1.4), which may be subject to misspecification of the mean function, ${X^{'}}_{i k} β,$ one may assume $Y_{i k} = μ_{i k} + v_{i} + e_{i k},$ where the $μ_{i k}$ are completely unspecified, unknown constants. The latter model is almost always correct, but is useless, because it does not utilize any relationship between $Y$ and $X$ at all. In fact, in practice, if auxiliary data are available, it is often "politically incorrect" not to use them. Secondly, even though there is a concern about model misspecification, there is often a lack of (statistical) evidence as to why something else is more reasonable, or whether a complication is necessary. For example, sometimes there is a concern about the normality assumption, but there is no indication as to why an alternative distribution, say, $t_{5},$ is more reasonable. As another example, suppose that one fits a quadratic model and finds that the coefficient of the quadratic term is insignificant. Then, one is not sure whether the complication of quadratic modeling is necessary as opposed to linear modeling. Thus, as far as this paper is concerned, we are not attempting to change the assumed model, or the BP, (1.5), based on the assumed model. In particular, we assume a single parameter, $γ,$ in (1.5) for the ratio $σ_{v}^{2} / σ_{e}^{2},$ rather than considering a heteroscedastic NER model such as in Jiang and Nguyen (2012), and Nandram and Sun (2012). Our goal is to find a better way to estimate the parameters, $ψ,$ of the assumed model that are involved in (1.5), so that the resulting BP, (1.5), is more robust against model misspecifications. We do so by considering an objective MSPE that is not model-dependent, defined as follows. Let $θ = {(θ_{i})}_{1 \leq i \leq m}$ denote the vector of small area means, and $\tilde{θ} = {[{\tilde{θ}}_{i}]}_{1 \leq i \leq m}$ the vector of BPs. Note that ${\tilde{θ}}_{i}$ depends on $ψ,$ that is, ${\tilde{θ}}_{i} = {\tilde{θ}}_{i} (ψ) .$ The design-based MSPE is

$MSPE (\tilde{θ}) = E ({| \tilde{θ} - θ |}^{2}) = \sum_{i = 1}^{m} E \{ {\tilde{θ}}_{i} (ψ) - θ_{i} \}^{2} . (1.6)$
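To make the design-based expectation concrete, the sketch below computes one area's contribution to (1.6) exactly, by enumerating every simple random sample without replacement from a small finite population. The population values and function names are hypothetical. For simplicity the predictor used here is the plain sample mean rather than the BP, which allows a check against the textbook variance formula $(1 - n / N) S^{2} / n$.

```python
from itertools import combinations


def design_mspe(Y, n, predictor):
    """Average of (predictor(sample) - Ybar)^2 over all SRS samples of size n."""
    N = len(Y)
    ybar_pop = sum(Y) / N
    # Every subset of size n is equally likely under SRS without replacement.
    samples = list(combinations(range(N), n))
    return sum((predictor(s) - ybar_pop) ** 2 for s in samples) / len(samples)


# Hypothetical finite population of responses for one area.
Y = [3.0, 5.0, 4.0, 8.0, 10.0]


def sample_mean(sample_indices):
    return sum(Y[k] for k in sample_indices) / len(sample_indices)


mspe_direct = design_mspe(Y, 2, sample_mean)

# Textbook check: the design variance of the sample mean is (1 - n/N) * S^2 / n.
N, n = len(Y), 2
ybar_pop = sum(Y) / N
S2 = sum((v - ybar_pop) ** 2 for v in Y) / (N - 1)
print(mspe_direct, (1 - n / N) * S2 / n)
```

No model for $Y$ enters the computation; only the sampling design does. The overall MSPE in (1.6) is the sum of such terms over the $m$ areas.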

Note that the $E$ in (1.6) is different from the $E_{M}$ in (1.2), (1.3), or (1.5) in that $E$ is completely model-free; namely, the expectation in (1.6) is with respect to the simple random sampling from the areas, which has nothing to do with the assumed model. Jiang et al. (2011) showed that the MSPE in (1.6) has an alternative expression, which is a key idea of the OBP. Namely, we have $MSPE (\tilde{θ}) = E \{ Q (ψ) + \cdots \},$ where $\cdots$ does not depend on $ψ,$ and

$Q (ψ) = \sum_{i = 1}^{m} \left\{ {\tilde{θ}}_{i}^{2} (ψ) - 2 \frac{1 - r_{i}}{1 + n_{i} γ} {\bar{y}}_{i \cdot} {\bar{X}}^{'}_{i} β + b_{i} (γ) {\hat{μ}}_{i}^{2} \right\} = \sum_{i = 1}^{m} Q_{i} . (1.7)$

In (1.7), $ψ$ is treated as a generic parameter vector rather than the true parameter vector, and $b_{i} (γ) = 1 - 2 a_{i} (γ)$ with $a_{i} (γ) = r_{i} + (1 - r_{i}) n_{i} γ {(1 + n_{i} γ)}^{- 1} .$ Furthermore, ${\hat{μ}}_{i}^{2}$ is a design-unbiased estimator of ${\bar{Y}}_{i}^{2}$ that has the following expression:

${\hat{μ}}_{i}^{2} = \frac{1}{n_{i}} \sum_{j = 1}^{n_{i}} y_{i j}^{2} - \frac{N_{i} - 1}{N_{i} (n_{i} - 1)} \sum_{j = 1}^{n_{i}} {(y_{i j} - {\bar{y}}_{i \cdot})}^{2} . (1.8)$
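The design-unbiasedness of (1.8) can be checked numerically on a toy example. The Python sketch below, with a hypothetical finite population and a hypothetical function name, averages ${\hat{μ}}_{i}^{2}$ over all simple random samples without replacement and compares the result with ${\bar{Y}}_{i}^{2}$.

```python
from itertools import combinations


def mu2_hat(y, N):
    """Estimator (1.8) of Ybar^2 from a sample y of size n >= 2 drawn from N units."""
    n = len(y)
    ybar = sum(y) / n
    ss = sum((v - ybar) ** 2 for v in y)  # within-sample sum of squares
    return sum(v * v for v in y) / n - (N - 1) / (N * (n - 1)) * ss


# Hypothetical finite population for one area.
Y = [3.0, 5.0, 4.0, 8.0, 10.0]
N, n = len(Y), 3

# All SRS samples of size n are equally likely, so the design expectation
# is the plain average of the estimator over all of them.
estimates = [mu2_hat(list(s), N) for s in combinations(Y, n)]
avg = sum(estimates) / len(estimates)
target = (sum(Y) / N) ** 2
print(avg, target)
```

The two printed values should agree up to floating-point rounding, which is what design-unbiasedness means for this finite population.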

The BPE of $ψ, \hat{ψ},$ is the minimizer of $Q (ψ)$ with respect to $ψ .$ For the reader's convenience, the derivations of (1.7) and (1.8) are provided in the Appendix. Also note that the BP is based on the (model-based) area-specific MSPE (so it is optimal for every small area, if the assumed model is correct), while the BPE is based on the (design-based) overall MSPE. This is because we do not want the estimator of $ψ$ to be area-dependent. One reason is that area-dependent estimators are often unstable due to the small sample size from the area, while an estimator obtained by utilizing all of the areas, such as the BPE defined in this paper, tends to be much more stable.
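As a rough illustration of how the BPE might be computed, the following Python sketch evaluates $Q (ψ)$ in (1.7) and minimizes it by a crude grid search over $γ$. It makes several assumptions that are not from the paper: a single covariate with no intercept (so $β$ is a scalar), a closed-form $β$ for fixed $γ$ obtained here by setting the derivative of (1.7) with respect to $β$ to zero, and hypothetical data, function names and grid. It is not the authors' algorithm.

```python
def area_terms(area, gamma):
    """Per-area summaries used in (1.7); 'area' is a hypothetical dict layout."""
    y, x, N, Xbar = area["y"], area["x"], area["N"], area["Xbar"]
    n = len(y)
    r = n / N
    ybar, xbar = sum(y) / n, sum(x) / n
    a = r + (1 - r) * n * gamma / (1 + n * gamma)  # a_i(gamma)
    ss = sum((v - ybar) ** 2 for v in y)
    mu2 = sum(v * v for v in y) / n - (N - 1) / (N * (n - 1)) * ss  # (1.8)
    return ybar, xbar, Xbar, a, mu2


def q_value(beta, gamma, areas):
    """Q(psi) in (1.7) for scalar beta."""
    total = 0.0
    for area in areas:
        ybar, xbar, Xbar, a, mu2 = area_terms(area, gamma)
        theta = Xbar * beta + a * (ybar - xbar * beta)  # BP (1.5)
        # (1 - r)/(1 + n*gamma) equals 1 - a, and b_i(gamma) = 1 - 2a.
        total += theta ** 2 - 2 * (1 - a) * ybar * Xbar * beta + (1 - 2 * a) * mu2
    return total


def beta_given_gamma(gamma, areas):
    """Minimizer of Q over beta for fixed gamma (Q is quadratic in beta)."""
    num = den = 0.0
    for area in areas:
        ybar, xbar, Xbar, a, _ = area_terms(area, gamma)
        g = Xbar - a * xbar  # coefficient of beta in the BP
        num += ybar * ((1 - a) * Xbar - a * g)
        den += g * g
    return num / den


def bpe_grid(areas, gammas):
    """Return (Q, beta, gamma) minimizing Q over the supplied gamma grid."""
    best = None
    for gamma in gammas:
        beta = beta_given_gamma(gamma, areas)
        q = q_value(beta, gamma, areas)
        if best is None or q < best[0]:
            best = (q, beta, gamma)
    return best


# Hypothetical data: three areas, one covariate each.
areas = [
    {"y": [4.1, 5.0, 6.2], "x": [1.0, 1.5, 2.0], "N": 30, "Xbar": 1.6},
    {"y": [7.9, 9.1], "x": [2.5, 3.0], "N": 20, "Xbar": 2.6},
    {"y": [2.0, 3.1, 2.7, 3.9], "x": [0.5, 1.0, 0.8, 1.2], "N": 40, "Xbar": 0.9},
]
gammas = [0.05 * k for k in range(101)]
q_min, beta_hat, gamma_hat = bpe_grid(areas, gammas)
print(q_min, beta_hat, gamma_hat)
```

Profiling out $β$ leaves a one-dimensional search in $γ$, which keeps the sketch short; for a vector $β$ the same first-order condition becomes a linear system, and a general-purpose optimizer could replace the grid.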

The consideration of the design-based MSPE, as we do in this paper, is due to the fact that the design-based MSPE is completely model-free. Note that, in Jiang et al. (2011), where the authors considered the Fay-Herriot model, it is not possible to evaluate the design-based MSPE, because the actual samples from the areas are not available (only summaries of the data are available at the area level). Thus, instead, the authors considered model-based MSPE under the most general, or least restrictive, model, which simply assumes that the mean function is $μ_{i},$ where $μ_{i}$ is completely unknown, for the $i^{th}$ small area. In general, there is a "rule of thumb" on what kind of MSPE one should consider. Essentially, the rule is that one should make the MSPE as model-free as possible, so that it would be objective and (relatively) robust to model misspecifications.

In Section 2, we first consider a simulated example in which we compare the design-based predictive performance of the OBP with that of the EBLUP. Such comparisons were made in Jiang et al. (2011) under the Fay-Herriot model, but have not previously been made under the NER model. Furthermore, the simulation setting involves misspecification of both the mean function and the variance function, which, again, has not been considered before. The simulation results show that the OBP can outperform the EBLUP not just in the overall design-based MSPE but also in the (design-based) area-specific MSPE for every one of a large number of small areas. This is something that has not been observed before. For example, in Jiang et al. (2011), the OBP is shown to outperform the EBLUP in the overall MSPE but not necessarily for every small area.

An important problem of practical interest is estimation of the area-specific MSPEs, here the design-based MSPEs. In Section 3, we propose a bootstrap estimator for the area-specific MSPE, which has the advantages of being simple and always positive. Another simulation study is carried out to evaluate the performance of the proposed MSPE estimator. An application to the Television School and Family Smoking Prevention and Cessation Project (TVSFP) is discussed in Section 4.


Date modified: 2015-11-27
