Bayesian predictive inference of a proportion under a two-fold small area model with heterogeneous correlations
Section 3. Numerical study and comparisons

Table of contents

In this section, we perform empirical studies to assess the performance of the HeC model that we compare with the HoC model. In Section 3.1, we discuss an illustrative example and, in Section 3.2, we present a simulation study.

3.1 An illustrative example

We use data from the Third Grade US population; see Nandram (2015) for a brief discussion of these data. The dataset, collected in 1999, consists of 2,477 students who participated in the Third International Mathematics and Science Study (TIMSS). Foy, Rust, and Schleicher (1996) described the probability proportional to size (PPS) systematic sampling design used in TIMSS data collection and Caslyn, Gonzales and Frase (1999) gave highlights from TIMSS. Areas are formed crossing four regions (Northeast, South, Central and West) and three communities of the US (village or rural area, outskirts of a town or city and close to the center of a town or city). Thus, there are twelve areas. The binary variable is whether a student’s mathematics score is below average. Clusters are schools while units within the clusters are the students.

To assess the quality of the Bayesian predictive inference, as suggested by a referee, Nandram (2015) took a half sample of the original data, which he called a synthetic sample. The original sample was used as the population, and the half sample was used for analysis, thereby providing a method to assess the predictive power of the models in Nandram (2015). In the current paper, as suggested by a referee, we do not use a half sample and we use the original dataset available to us; see Table 3.1 for the entire dataset which we analyze in this paper. The predictive power of the HeC model is assessed mainly through the simulation study.

Unfortunately, as in many complex surveys, the sample fractions are unknown to secondary data analysts. However, typically for many of these complex surveys, the sample fractions are relatively small. For the TIMSS data we assume that the dataset is a 5% sample of the population. For example, if there are four sampled schools for an area, say $i^{th}$ area $(i =1, \dots, l),$ the total number of clusters, $M_{i}$ is assumed to be 80. If there are 17 observed students within a sampled school, say $j^{th}$ school, the total number of students, $N_{i j} (j =1, \dots, m_{i})$ is assumed to be 340. For the nonsampled schools, $N_{i j} (j = m_{i} + 1, \dots, M_{i})$ is assumed to be the average of the total number of students within the sampled schools for each area. Moreover, there are many schools in which all or many students were either below or above average. In other words, this dataset is far sparse, thereby making direct estimation difficult.

Table 3.1
Number of US students below average in mathematics within schools by area
Table summary
This table displays the results of Number of US students below average in mathematics within schools by area. The information is grouped by Area (appearing as row headers), (s, n), m and Schools (appearing as column headers).
Area	(s, n)	m	Schools
NR	40	4	9	10	11	10	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
NR	74	This is an empty cell	17	16	21	20	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
NO	60	9	8	7	12	3	12	8	7	1	2	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
NO	173	This is an empty cell	20	21	17	19	16	25	22	14	19	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
NC	135	11	9	20	1	22	20	11	26	10	1	12	3	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
NC	222	This is an empty cell	15	23	16	25	22	25	27	19	16	22	12	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
SR	84	8	6	14	14	9	14	10	12	5	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
SR	140	This is an empty cell	16	21	16	14	23	19	22	9	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
SO	164	16	14	9	12	10	18	11	3	0	13	9	13	8	11	10	19	4
SO	298	This is an empty cell	19	14	13	18	22	18	21	16	18	15	26	9	19	22	25	23
SC	150	13	16	11	13	6	8	9	13	6	11	15	15	18	9	This is an empty cell	This is an empty cell	This is an empty cell
SC	225	This is an empty cell	16	13	17	16	19	16	18	12	19	16	19	21	23	This is an empty cell	This is an empty cell	This is an empty cell
CR	17	2	7	10	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
CR	39	This is an empty cell	16	23	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
CO	59	7	13	11	5	15	3	2	10	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
CO	140	This is an empty cell	22	18	9	19	24	23	25	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
CC	145	14	21	1	12	9	12	13	16	13	7	12	7	8	4	10	This is an empty cell	This is an empty cell
CC	259	This is an empty cell	21	26	22	13	16	18	21	18	17	18	17	19	16	17	This is an empty cell	This is an empty cell
WR	54	7	13	11	4	2	7	11	6	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
WR	118	This is an empty cell	15	19	10	16	16	20	22	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell	This is an empty cell
WO	117	13	8	11	15	9	7	10	1	15	14	9	7	6	5	This is an empty cell	This is an empty cell	This is an empty cell
WO	224	This is an empty cell	13	13	25	16	20	12	20	18	20	17	17	17	16	This is an empty cell	This is an empty cell	This is an empty cell
WC	331	31	9	17	10	12	15	15	8	22	20	7	18	7	13	15	13	8
	This is an empty cell	This is an empty cell	6	8	17	13	9	6	12	7	11	4	9	8	2	3	7	This is an empty cell
	515	This is an empty cell	18	22	10	14	15	15	8	23	22	7	18	10	26	29	13	17
	This is an empty cell	This is an empty cell	16	14	18	15	13	23	21	26	16	11	14	14	17	15	15	This is an empty cell
Note: (s, n) represent s (top), the number of students scoring below average and n (bottom) the sample size [e.g., NR has 74 students sampled from m = 4 schools with a total of 40 students scoring below average]. The areas formed by crossing region (N: north, S: south, C: central, W: west) and community (R: rural, O: outskirt of a town or city, C: town or city).

We perform three goodness-of-fit procedures, the deviance information criterion (DIC), the Bayesian posterior predictive $p -$ value (BPP) and the log pseudo marginal likelihood (LPML), which is a measure based on the same cross-validation (leave-one-out) procedure. We can assess the overall fit of the models with these procedures.

In the HeC model, $s_{i j} | p_{i j} \overset{ind}{\sim} Binomial (n_{i j}, p_{i j}), p_{i j} \overset{ind}{\sim} Beta (μ_{i} (1 - ρ_{i}) / ρ_{i}, (1 - μ_{i}) (1 - ρ_{i}) / ρ_{i}) .$ Thus, by integrating out the $p_{i j},$ we can obtain the following beta-binomial probability mass function,

$f (s | μ, ρ) = \prod_{i =1}^{l} \prod_{j =1}^{m_{i}} (\begin{array}{l} \begin{array}{l} n_{i j} \\ s_{i j} \end{array} \end{array}) \frac{B (s_{i j} + μ_{i} (1 - ρ_{i}) / ρ_{i}, n_{i j} - s_{i j} + (1 - μ_{i}) (1 - ρ_{i}) / ρ_{i})}{B (μ_{i} (1 - ρ_{i}) / ρ_{i}, (1 - μ_{i}) (1 - ρ_{i}) / ρ_{i})} .$

It is also true that $E (s_{i j} | μ_{i}, ρ_{i}) = n_{i j} μ_{i}$ and $Var (s_{i j} | μ_{i}, ρ_{i}) = n_{i j} {1 + (n_{i j} - 1) ρ_{i}} μ_{i} (1 - μ_{i}) .$

Let $μ_{i}^{(h)}$ and $ρ_{i}^{(h)}$ $(i =1, \dots, l, h =1, \dots, H)$ denote the iterates from the blocked Gibbs sampler. Let ${\bar{μ}}_{i} = \sum_{h =1}^{H} μ_{i}^{(h)} / H (i =1, \dots, l)$ and ${\bar{ρ}}_{i} = \sum_{h =1}^{H} ρ_{i}^{(h)} / H .$ Letting $D (\bar{μ}, \bar{ρ}) = - 2 \log {p (s | \bar{μ}, \bar{ρ})}$ and $\bar{D} = - 2 \sum_{h =1}^{H} log {p (s | μ^{(h)}, ρ^{(h)})} / H,$ deviance information criterion is given by

$DIC =2 \bar{D} - D (\bar{μ}, \bar{ρ}) .$

Models with smaller DIC are more preferred over those with larger DIC. However, since DIC tends to select over-fitted models, Nandram (2015) described the Bayesian predictive $p -$ values as a backup. For the HeC model, the discrepancy function is

$T (s; μ, ρ) = \sum_{i =1}^{l} \sum_{j =1}^{m_{i}} \frac{{s_{i j} - E (s_{i j} | μ_{i}, ρ_{i})}^{2}}{Var (s_{i j} | μ_{i}, ρ_{i})} .$

Let $s^{(rep)}$ denote repeated (rep) samples from the posterior predictive distribution of $s .$ Then the BPP is $p {T (s^{(rep)}; μ, ρ) \geq T (s^{(obs)}; μ, ρ) | s},$ which is calculated over its corresponding iterates $(μ^{(h)} , ρ^{(h)}), h = 1, \dots, H .$ If the value of this probability is close to 0 or 1, it indicates poor fit of the model. In fact, models with BPPs in (0.05, 0.95) are considered reasonable.

In addition to these quantities, we can evaluate the goodness-of-fit of models with another measure, the LPML which is a summary statistic of the conditional predictive ordinate (CPO) values, and it is based on a cross validation. Unlike the DIC, larger values of LPML indicate better fitting models (e.g., Geisser and Eddy 1979).

For the HeC model, the CPO can be estimated by

${\hat{CPO}}_{i j} = {[\frac{1}{H} \sum_{h = 1}^{H} \frac{1}{f (s_{i j} | p_{i j}^{(h)})}]}^{- 1} , j = 1, \dots, m_{i}, i = 1, \dots, l,$

where $p_{i j}^{(h)}$ is the samples from $p_{i j} | s_{i j}, μ_{i}, ρ_{i}$ and $s_{i j} | p_{i j} \overset{iid}{\sim} Binomial (n_{i j}, p_{i j}) .$ Note that for each $(i, j),$ ${\hat{CPO}}_{i j}$ is the harmonic mean of the likelihoods $f (s_{i j} | p_{i j}^{(h)}), h =1, \dots, H .$ Then, the LPML is

$LPML = \sum_{i =1}^{l} \sum_{j =1}^{m_{i}} \log ({\hat{CPO}}_{i j}) .$

These three model evaluation measures have similar forms under the HoC model. For the HoC (HeC) model, DIC = 774.421 (773.173), BPP = 0.349 (0.408), LPML = -352.064 (-346.171), thereby indicating that the HeC model gives a better fit. At a finer level, we also looked at the individual CPO values from the two models for each school. In Figure 3.1 we compare the CPOs from the HeC and the HoC models, and we found that generally CPO values for the HeC model are higher that those of HoC model. In fact, under the HoC (HeC) model we found that the percent of the CPOs less than 0.025 is 3.70% (2.96%) and percent of the CPOs less than 0.014 is 0.74% (0.00%). These results do not show any indication of serious departure from model assumptions; see Ntzoufras (2009). Therefore, these measures give prima facie evidence that the HeC model fits the TIMSS data somewhat better than the HoC model.

Figure 3.1

Description for Figure 3.1

Figure made of two scatter plots. The first compares the individual CPO values from the HeC and HoC models for each school. The school is on the y-axis, going from 0 to 150. CPOs are on the x-axis, going from 0.00 to 0.35. Most of the data points are concentrated between CPO values of 0.00 and 0.15 for both models.

The second graph compares the CPOs for HeC and HoC models. CPOs for HeC model are on the y-axis, ranging from 0.00 to 0.30. CPOs for HoC model are on the x-axis, ranging from 0.0 to 0.25. Generally CPO values for the HeC model are higher than those of HoC model.

Now, consider inference about $θ$ and $γ .$ First, consider $θ .$ Under the HoC model, the posterior mean (PM) is 0.519, posterior standard deviation (PSD) is 0.068 and 95% credible interval (Cre) is (0.390, 0.639). Under the HeC model PM = 0.515, PSD = 0.065 and 95% Cre is (0.383, 0.639). Second, consider $γ .$ Under the HoC model PM = 0.207, PSD = 0.011, and 95% Cre is (0.190, 0.224). Under the HeC model $γ$ PM = 0.208, PSD = 0.011 and 95% Cre is (0.190, 0.225). Thus, it is good that inference about $θ$ and $γ$ are very close for the two competitors (HoC and HeC models).

In Table 3.2, we present posterior inference about the finite population proportions for mathematics scores by areas. There are differences between the posterior means under the HoC and HeC models. Most of them are small but there are a few large differences. For NC, SR and CR, we have 0.560 (0.543), 0.568 (0.584) and 0.465 (0.445) under the HoC (HeC) model, respectively. The posterior standard deviations are also close but there are a few moderately large differences (e.g., for NR we have 0.113 under the HoC model and 0.077 under the HeC model). These differences are reflected in the credible and highest posterior density (HPD) intervals.

Table 3.2
Comparison of posterior inference from the two-fold models with homogeneous correlation (HoC) and heterogeneous correlations (HeC) for the finite population proportions for US students below average in mathematics by area
Table summary
This table displays the results of Comparison of posterior inference from the two-fold models with homogeneous correlation (HoC) and heterogeneous correlations (HeC) for the finite population proportions for US students below average in mathematics by area. The information is grouped by Area (appearing as row headers), HoC Model and HeC Model (appearing as column headers).
Area	HoC Model				HeC Model
Area	PM	PSD	95% Cre	95% HPD	PM	PSD	95% Cre	95% HPD
NR	0.522	0.113	(0.299, 0.735)	(0.310, 0.741)	0.525	0.077	(0.363, 0.662)	(0.361, 0.658)
NO	0.365	0.075	(0.227, 0.524)	(0.227, 0.520)	0.359	0.072	(0.228, 0.511)	(0.236, 0.516)
NC	0.560	0.070	(0.420, 0.701)	(0.408, 0.680)	0.543	0.082	(0.370, 0.695)	(0.396, 0.710)
SR	0.568	0.080	(0.405, 0.725)	(0.424, 0.731)	0.584	0.062	(0.454, 0.699)	(0.456, 0.699)
SO	0.537	0.058	(0.423, 0.648)	(0.417, 0.639)	0.537	0.063	(0.409, 0.655)	(0.408, 0.653)
SC	0.646	0.064	(0.552, 0.766)	(0.522, 0.766)	0.654	0.059	(0.521, 0.763)	(0.544, 0.774)
CR	0.465	0.137	(0.195, 0.719)	(0.185, 0.709)	0.445	0.125	(0.212, 0.716)	(0.199, 0.700)
CO	0.437	0.085	(0.279, 0.603)	(0.276, 0.596)	0.439	0.091	(0.257, 0.620)	(0.265, 0.620)
CC	0.549	0.064	(0.415, 0.671)	(0.423, 0.672)	0.550	0.066	(0.414, 0.681)	(0.422, 0.685)
WR	0.461	0.086	(0.297, 0.629)	(0.295, 0.626)	0.460	0.085	(0.289, 0.626)	(0.276, 0.611)
WO	0.516	0.066	(0.384, 0.643)	(0.387, 0.644)	0.516	0.058	(0.401, 0.626)	(0.409, 0.633)
WC	0.670	0.042	(0.581, 0.748)	(0.586, 0.749)	0.662	0.047	(0.569, 0.748)	(0.568, 0.746)
Note: PM is the posterior mean, PSD is the posterior standard deviation, Cre is the equal-tail credible interval, and HPD is highest posterior density interval.

Table 3.3 shows summaries of the PM, PSD and 95% HPD for intracluster correlations under the HeC model. We can see that the intracluster correlations vary over the areas. The largest estimate is 0.337 for NC and the smallest one is 0.073 for SR. Both areas have a few large difference between the posterior means under the HoC and HeC models. The 95% HPD interval for the common correlation in the HoC model is (0.160, 0.260) and this interval is contained by all the intervals except for NR, NC, SR and WC. Thus, it is reasonable to study the HeC model.

Table 3.3
Posterior summaries for the intracluster correlations of the two-fold models with heterogeneous correlations for US students below average in mathematics by area
Table summary
This table displays the results of Posterior summaries for the intracluster correlations of the two-fold models with heterogeneous correlations for US students below average in mathematics by area. The information is grouped by Area (appearing as row headers), PM , PSD , 95% Cre and 95% HPD (appearing as column headers).
Area	PM	PSD	95% Cre	95% HPD
NR	0.076	0.084	(0.002, 0.301)	(0.001, 0.251)
NO	0.184	0.087	(0.053, 0.380)	(0.042, 0.358)
NC	0.337	0.087	(0.190, 0.520)	(0.184, 0.513)
SR	0.073	0.067	(0.003, 0.252)	(0.001, 0.216)
SO	0.237	0.075	(0.113, 0.393)	(0.110, 0.387)
SC	0.176	0.079	(0.055, 0.356)	(0.048, 0.329)
CR	0.149	0.147	(0.003, 0.523)	(0.001, 0.445)
CO	0.233	0.103	(0.079, 0.486)	(0.050, 0.434)
CC	0.235	0.077	(0.105, 0.388)	(0.099, 0.381)
WR	0.181	0.099	(0.033, 0.413)	(0.021, 0.378)
WO	0.181	0.075	(0.059, 0.362)	(0.048, 0.327)
WC	0.301	0.063	(0.191, 0.437)	(0.188, 0.434)
Note: Using the two-fold model with homogeneous correlation, PM = 0.211, PSD = 0.026, 95% Cre = (0.162, 0.266), and 95% HPD = (0.160, 0.260). PM is the posterior mean, PSD is the posterior standard deviation, Cre is the equal-tail credible interval, and HPD is highest posterior density interval.

In Figure 3.2, we compare the posterior densities of the intracluster correlations (twelve correlations) from the HeC model and the HoC model (one correlation). The distributions under the HeC model are more variable and are mostly to the left or right of those under the HoC model with not much overlap for some areas (e.g., NR, NC and SR).

Figure 3.2

Description for Figure 3.2

Figure made of twelve graphs presenting the posterior densities of intracluster correlations for mathematics scores by areas (NR, NO, NC, SR, SO, SC, CR, CO, CC, WR, WO and WC) for HeC and HoC models. For each graph, density is on the y-axis, ranging from 0 to 15 and correlations are on the x-axis, ranging from 0.0 to 0.8. The distributions under the HeC model are more variable and are mostly to the left or right of those under the HoC model. Areas NR, NC and SR show a weak overlap between the two distributions. Areas NO, SC, CR, WR, WO and WC show an overlap a little bit bigger. Finally, densities for areas SO, CO and CC overlap more, but not completely.

In Figures 3.3, 3.4 and 3.5, we compare the posterior density plots of the finite population proportions for the mathematics score and all areas for the two models. There are noticeable differences between the HoC and HeC models (e.g., areas NR, NC, SR, CR and WC).

Figure 3.3

Description for Figure 3.3

Figure made of four graphs presenting the posterior densities of finite population proportions for mathematics scores by areas (NR, NO, NC and SR) for HeC and HoC models. For each graph, density is on the y-axis, ranging from 0 to 6 and proportions are on the x-axis, ranging from 0.0 to 0.8. There are noticeable differences between the HoC and HeC models for areas NR, NC and SR. The distributions are closer for area NO.

Figure 3.4

Description for Figure 3.4

Figure made of four graphs presenting the posterior densities of finite population proportions for mathematics scores by areas (SO, SC, CR and CO) for HeC and HoC models. For each graph, density is on the y-axis, ranging from 0 to 6 and proportions are on the x-axis, ranging from 0.0 to 0.8. There are noticeable differences between the HoC and HeC models for area CR. The distributions are closer for areas SO, SC and CO.

Figure 3.5

Description for Figure 3.5

Figure made of four graphs presenting the posterior densities of finite population proportions for mathematics scores by areas (CC, WR, WO and WC) for HeC and HoC models. For each graph, density is on the y-axis, ranging from 0 to 6 for CC and WR and from 0 to 8 for WO and WC and proportions are on the x-axis, ranging from 0.0 to 0.8. There are noticeable differences between the HoC and HeC models for area WC. The distributions are closer for areas CC, WR and WO.

3.2 Simulation study

In order to further assess the performance of the HeC model and to compare it to the HoC model, we perform a simulation study. Here we use two factors, each at three levels, to get nine design points.

We have set 100 as the number of clusters (schools) in each area and 15 as the number of individuals (students) within each cluster. In other words, we take $N_{i j} =15, j =1, \dots, M_{i}, M_{i} =100, i =1, \dots, l$ where $l =12.$ Let $a$ denote a vector of posterior means and $b$ denote the vector of posterior standard deviations corresponding to the $μ_{i}$ or the $ρ_{i} .$ Specifically, for the $ρ_{i}$ we use $a_{1}$ and $b_{1}$ and for the $μ_{i}$ we use $a_{2}$ and $b_{2} .$ When we simulate data from the HeC model, the levels of the $ρ_{i}$ are $(1 : a_{1} - 0.5 b_{1}; 2 : a_{1}; 3 : a_{1} + 0.5 b_{1})$ and the levels of the $μ_{i}$ are $(1 : a_{2} - 0.5 b_{2}; 2 : a_{2}; 3 : a_{2} + 0.5 b_{2}) .$ For the twelve areas $a_{1}$ takes values 0.09, 0.19, 0.32, 0.08, 0.22, 0.18, 0.15, 0.22, 0.23, 0.17, 0.18, 0.30; $b_{1}$ 0.08, 0.09, 0.08, 0.06, 0.07, 0.08, 0.13, 0.09, 0.07, 0.09, 0.07, 0.06; $a_{2}$ 0.53, 0.37, 0.54, 0.58, 0.54, 0.65, 0.46, 0.44, 0.55, 0.46, 0.52, 0.66; and $b_{2}$ 0.08, 0.08, 0.08, 0.06, 0.06, 0.06, 0.12, 0.09, 0.07, 0.08, 0.06, 0.05.

We also take a simple random sample of five clusters among the 100 population clusters, and a simple random sample of ten individuals from each sampled cluster (i.e., $m_{i} =5$ and $n_{i j} =10) .$ These numbers are much smaller than those of data used in Section 3.1, which makes inference a little more challenging (Nandram 2015). Note that the dataset has about 7% of the sampled clusters where all students were either below or above average. We call this quantity the percent of sparseness. The setting of this simulation study also leads to even sparser data. For nine design points, all the average percents of sparseness are greater than 7% and most are around 10%. Figure 3.6 shows the histograms of sparseness percents for each design point.

Figure 3.6

Description for Figure 3.6

Figure made of nine graphs presenting the histograms of the percent of sparseness when data are drawn from the HeC model by design point $[(i, j) : i, j = 1, 2, 3]$ in which the first factor corresponds to $ρ$ and the second factor to $μ .$ For each graph, the frequency is on the y-axis ranging from 0 to 200 and the percent of sparseness is on the x-axis ranging from 0.00 to 0.30 when $μ = 3$ and from 0.00 to 0.20 otherwise. For the three graphs where $μ = 1,$ the percent of sparseness peaks at about 10%, with a higher frequency below than above 10%. For the three graphs where $μ = 2,$ the percent of sparseness peaks at about 10%, with a higher frequency above than below 10%. For the three graphs where $μ = 3,$ the percent of sparseness peaks at about 10%, with a very low frequency for percent lower than 10%.

We consider two scenarios. In the first scenario, we generate data from the HeC model and fit both models, and in the second scenario, we generate data from the HoC model and fit both models. When data are simulated from the HeC model, we have nine design points $[(1,1),(1,2),(1,3), \dots,(3,1),(3,2),(3,3)],$ the first factor corresponds to the $ρ_{i} .$ When we simulate data from the HoC model, we have three design points $(1 : a_{2} - 0.5 b_{2}; 2 : a_{2}; 3 : a_{2} + 0.5 b_{2})$ for the three levels for the $μ_{i};$ $ρ$ is kept fixed at its posterior mean.

In the first scenario, at each design point we simulate binary data from the HeC model,

$\begin{array}{l} p_{i j} | μ_{i}, ρ_{i} & \overset{ind}{\sim} Beta [μ_{i} \frac{1 - ρ_{i}}{ρ_{i}}, (1 - μ_{i}) \frac{1 - ρ_{i}}{ρ_{i}}], \\ y_{i j k} | p_{i j} & \overset{ind}{\sim} Bernoulli (p_{i j}), k =1, \dots, N_{i j} . \end{array}$

So we have the true values of $P_{i} = \sum_{j =1}^{M_{i}} \sum_{k =1}^{N_{i j}} y_{i j k} / \sum_{j =1}^{M_{i}} N_{i j}$ for $i =1, \dots, l .$ We take 1,000 samples at each of the nine design points. For each sample we perform the blocked griddy Gibbs sampler in the same manner as for the data.

Like Nandram (2015), we calculate ${AB}_{i h} = | {PM}_{i h} - P_{i h} |, {RAB}_{i h} = {AB}_{i h} / P_{i h}$ and ${RPMSE}_{i h} = \sqrt{{PSD}_{i h}^{2} + {AB}_{i h}^{2}}$ to study the frequentist properties of our procedure $(i =1, \dots, l, h =1, \dots, 1,000) .$ We also obtain the 95% credible interval and HPD interval for each of the 1,000 simulated runs, and we study the width $W_{i h}$ and the credible incidence $I_{i h} .$ If the 95% credible (or HPD) interval of $h^{th}$ run contains the true value $P_{i},$ $I_{i h}$ is equal to one, otherwise it is equal to zero. Thus, the estimated probability content of the 95% credible interval for the $i^{th}$ area is $C_{i} = \sum_{h =1}^{1,000} I_{i h} / 1,000 .$

Table 3.4 shows comparison of the HoC and HeC models. Under the HeC model the coverages are much higher than those under the HoC model. Note that the coverages of HPD intervals for the HeC model are much closer to the nominal value of 95% and they are conservative. However, the 95% credible and HPD intervals are wider than those from the HoC model. These effects are much larger as $ρ$ becomes larger. All measures AB, RAB and RPMSE under the HeC model are smaller than those under the HoC model. Thus, based on these measures, the HeC model is preferred over the HoC model.

Table 3.4
Simulation under the HeC model: Comparison of the HeC and HoC models using mean coverage and widths of 95% credible intervals and absolute bias, relative absolute bias and root posterior mean squared error for finite poulation proportions by design point
Table summary
This table displays the results of Simulation under the HeC model: Comparison of the HeC and HoC models using mean coverage and widths of 95% credible intervals and absolute bias. The information is grouped by Design Point (appearing as row headers), Model , C-Cre , W-Cre , C-HPD , W-HPD, AB , AB and RPMSE (appearing as column headers).
Design Point	Model	C-Cre	W-Cre	C-HPD	W-HPD	AB	RAB	RPMSE
(1,1)	HeC	0.989	0.620	0.961	0.603	0.112	0.227	0.206
(1,1)	HoC	0.930	0.555	0.893	0.541	0.130	0.266	0.207
(1,2)	HeC	0.984	0.622	0.960	0.603	0.112	0.227	0.206
(1,2)	HoC	0.926	0.558	0.889	0.545	0.132	0.249	0.209
(1,3)	HeC	0.980	0.623	0.955	0.608	0 .120	0.211	0.210
(1,3)	HoC	0.923	0.558	0.892	0.546	0.134	0.236	0.212
(2,1)	HeC	0.982	0.621	0.953	0.603	0.119	0.242	0.212
(2,1)	HoC	0.922	0.564	0.879	0.549	0.137	0.281	0.215
(2,2)	HeC	0.980	0.625	0.952	0.609	0.122	0.228	0.214
(2,2)	HoC	0.918	0.566	0.879	0.552	0.139	0.264	0.217
(2,3)	HeC	0.981	0.628	0.956	0.611	0.121	0.211	0.214
(2,3)	HoC	0.930	0.570	0.895	0.556	0.135	0.239	0.214
(3,1)	HeC	0.982	0.627	0.949	0.608	0.121	0.245	0.215
(3,1)	HoC	0.934	0.583	0.892	0.566	0.136	0.278	0.218
(3,2)	HeC	0.980	0.628	0.947	0.610	0.123	0.242	0.217
(3,2)	HoC	0.928	0.583	0.885	0.566	0.138	0.274	0.220
(3,3)	HeC	0.976	0.632	0.951	0.614	0.124	0.218	0.218
(3,3)	HoC	0.928	0.581	0.889	0.565	0.139	0.246	0.220
Note: In the design point [(i, j): i, j = 1, 2, 3], the first factor corresponds to $ρ$ and the second factor to $μ .$ C-Cre and C-HPD are the probability contents of a credible interval and a HPD interval; W-Cre and W-HPD are the widths of a credible interval and a HPD interval. AB, RAB and RPMSE are the absolute bias, relative absolute bias and root posterior mean squared error.

In Table 3.5 we compare summaries of DIC, BPP and LPML. All the DICs under the HeC model are smaller than the corresponding ones under the HoC model and all the LPMLs under the HeC model are larger than those under the HoC model. Under the HoC model, all the BPPs vary in (0.06, 0.09) but under the HeC model they vary in (0.2, 0.4). Again, these measures show that the HeC model is superior to the HoC model.

In a similar manner, for the second scenario we generate binary data from

$\begin{array}{l} p_{i j} | μ_{i}, ρ & \overset{ind}{\sim} Beta [μ_{i} \frac{1 - ρ}{ρ}, (1 - μ_{i}) \frac{1 - ρ}{ρ}], \\ y_{i j k} | p_{i j} & \overset{ind}{\sim} Bernoulli (p_{i j}), k =1, \dots, N_{i j} . \end{array}$

In Table 3.6 we present comparison of the HoC and HeC models. Here AB, RAB and RPMSE are only slightly smaller under the HoC model. The coverages of the credible and HPD intervals under the HeC model are closer to the nominal value of 95%, while those under the HoC model are smaller. Table 3.7 shows summaries of DIC, BPP and LPML. All the DICs under the HeC model are smaller than those under the HeC model, while the BPPs and LPMLs are similar for the two models, with those under the HoC model being slightly better.

Table 3.5
Simulation under the HeC model: Comparison of the HeC and HoC models using the deviance information criterion (DIC), the Bayesian predictive p-value (BPP) and the log pseudo marginal likelihood (LPML) by design point
Table summary
This table displays the results of Simulation under the HeC model: Comparison of the HeC and HoC models using the deviance information criterion (DIC). The information is grouped by Design Point (appearing as row headers), HoC Model and HeC Model (appearing as column headers).
Design Point	HoC Model			HeC Model
Design Point	DIC	BPP	LPML	DIC	BPP	LPML
(1,1)	419.275	0.090	-285.452	402.044	0.429	-267.990
(1,2)	418.351	0.091	-286.250	400.647	0.439	-266.377
(1,3)	416.784	0.088	-286.290	400.414	0.446	-267.203
(2,1)	436.980	0.067	-307.028	416.264	0.300	-292.756
(2,2)	437.306	0.062	-308.816	414.955	0.318	-292.404
(2,3)	430.531	0.080	-302.258	410.436	0.351	-285.206
(3,1)	441.204	0.090	-316.126	424.010	0.227	-308.825
(3,2)	442.165	0.083	-318.223	424.363	0.235	-309.815
(3,3)	438.305	0.071	-315.159	418.827	0.260	-306.619
Note: In the design point [(i, j): i, j = 1, 2, 3], the first factor corresponds to $ρ$ and the second factor to $μ .$ DIC, BPP and LPML are summarized over the 1,000 simulation runs.

Table 3.6
Simulation under the HoC model: Comparison of the HeC and HoC models using mean coverage and widths of 95% credible intervals and absolute bias, relative absolute bias and root posterior mean squared error for finite poulation proportions by design point
Table summary
This table displays the results of Simulation under the HoC model: Comparison of the HeC and HoC models using mean coverage and widths of 95% credible intervals and absolute bias. The information is grouped by Design Point (appearing as row headers), Model , C-Cre , W-Cre , C-HPD , W-HPD, AB , RAB and RPMSE (appearing as column headers).
Design Point	Model	C-Cre	W-Cre	C-HPD	W-HPD	AB	RAB	RPMSE
1	HeC	0.985	0.627	0.969	0.608	0.117	0.242	0.212
1	HoC	0.944	0.575	0.919	0.559	0.107	0.240	0.210
2	HeC	0.988	0.634	0.952	0.616	0.122	0.234	0.216
2	HoC	0.938	0.585	0.917	0.568	0.115	0.214	0.211
3	HeC	0.977	0.628	0.940	0.611	0.126	0.222	0.218
3	HoC	0.933	0.572	0.908	0.556	0.113	0.202	0.208
Note: The design point [i: i = 1, 2, 3], corresponds to $μ$ with $ρ$ held fixed at its posterior mean. C-Cre and C-HPD are the probability contents of a credible interval and a HPD interval; W-Cre and W-HPD are the widths of a credible interval and a HPD interval. AB, RAB and RPMSE are the absolute bias, relative absolute bias and root posterior mean squared error.

Table 3.7
Simulation under the HoC model: Comparison of the HeC and HoC models using the deviance information criterion (DIC), the Bayesian predictive p-value (BPP) and the log pseudo marginal likelihood (LPML) by design point
Table summary
This table displays the results of Simulation under the HoC model: Comparison of the HeC and HoC models using the deviance information criterion (DIC). The information is grouped by Design Point (appearing as row headers), HoC Model and HeC Model (appearing as column headers).
Design Point	HoC Model			HeC Model
Design Point	DIC	BPP	LPML	DIC	BPP	LPML
1	428.647	0.308	-300.526	416.626	0.302	-303.001
2	430.113	0.371	-295.191	417.557	0.317	-296.531
3	429.598	0.379	-295.613	414.877	0.335	-297.250
Note: The design point [i: i = 1, 2, 3], corresponds to $μ$ with $ρ$ held fixed at its posterior mean. DIC, BPP and LPML are summarized over the 1,000 simulation runs.

Thus, when data actually come from the HeC model, there are some important differences among the two models, with the HeC model being preferred. However, when data actually come from the HoC model, there are minor differences between the two models. Of course, the HeC model (unequal correlations) has more parameters than the HoC model (one correlation).

ISSN : 1492-0921

Editorial policy

Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.

Submission of Manuscripts

Survey Methodology is published twice a year in electronic format. Authors are invited to submit their articles in English or French in electronic form, preferably in Word to the Editor, (statcan.smj-rte.statcan@canada.ca, Statistics Canada, 150 Tunney’s Pasture Driveway, Ottawa, Ontario, Canada, K1A 0T6). For formatting instructions, please see the guidelines provided in the journal and on the web site (www.statcan.gc.ca/SurveyMethodology).

Note of appreciation

Canada owes the success of its statistical system to a long-standing partnership between Statistics Canada, the citizens of Canada, its businesses, governments and other institutions. Accurate and timely statistical information could not be produced without their continued co-operation and goodwill.

Standards of service to the public

Statistics Canada is committed to serving its clients in a prompt, reliable and courteous manner. To this end, the Agency has developed standards of service which its employees observe in serving its clients.

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: semi-annual

Ottawa

Date modified:: 2017-06-22

Language selection

Search and menus

Search

Bayesian predictive inference of a proportion under a two-fold small area model with heterogeneous correlations
Section 3. Numerical study and comparisons

3.1 An illustrative example

3.2 Simulation study

Bayesian predictive inference of a proportion under a two-fold small area model with heterogeneous correlations Section 3. Numerical study and comparisons

3.1 An illustrative example

3.2 Simulation study

Editorial policy

Submission of Manuscripts

Note of appreciation

Standards of service to the public

Copyright

Bayesian predictive inference of a proportion under a two-fold small area model with heterogeneous correlations
Section 3. Numerical study and comparisons