3. Variance estimation for the one-step calibration estimator
Phillip S. Kott and Dan Liao
In this section,
we let
be the calibration-weighted estimator for
where
when
is the calibration weight, and
is conveniently defined to be 0 when
The weight-adjustment function
is defined implicitly by equation (2.4), and
is again chosen so that the calibration
equation (2.5) holds for either
or
We propose the
following estimator for the variance
where
is the joint selection probability of
and
under the original sampling design,
when
and 0 otherwise,
and
We will show that
in equation (3.1) can be nearly unbiased in
some sense if either a response model (Section 3.1) or prediction model
holds (Section 3.2).
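To make the weighting mechanism concrete, here is a minimal generic sketch of calibration weighting under nonresponse. The exponential (raking-type) adjustment function, the Newton solver, and all simulated data are assumptions made only for illustration; the paper's weight-adjustment function is defined implicitly by its equation (2.4) and need not be exponential.

```python
import numpy as np

# Minimal sketch of calibration weighting under nonresponse (illustrative
# only).  We assume an exponential, raking-type adjustment alpha(v) = exp(v);
# the paper's alpha is defined implicitly by equation (2.4).
# d: design weights, x: calibration variables, r: response indicators.

def calibrate_weights(x, d, r, tol=1e-8, max_iter=100):
    """Find g so that sum_k w_k x_k, with w_k = d_k exp(x_k'g) for
    respondents and w_k = 0 otherwise, matches the full-sample weighted
    x-totals (Newton's method)."""
    target = (d[:, None] * x).sum(axis=0)      # full-sample x-totals
    g = np.zeros(x.shape[1])
    for _ in range(max_iter):
        w = d * np.exp(x @ g) * r              # calibration weights (0 if r=0)
        resid = target - (w[:, None] * x).sum(axis=0)
        if np.max(np.abs(resid)) < tol:
            break
        J = (w[:, None] * x).T @ x             # Jacobian of the w-totals in g
        g = g + np.linalg.solve(J, resid)
    return d * np.exp(x @ g) * r

rng = np.random.default_rng(0)
n = 200
x = np.column_stack([np.ones(n), rng.normal(size=n)])
d = rng.uniform(1.0, 3.0, size=n)              # hypothetical design weights
r = (rng.uniform(size=n) < 0.7).astype(float)  # hypothetical response flags
y = 2.0 + 3.0 * x[:, 1] + rng.normal(size=n)

w = calibrate_weights(x, d, r)
t_hat = float((w * y).sum())                   # calibration-weighted total
```

The returned weights are zero for nonrespondents, matching the convention above, and reproduce the full-sample weighted x-totals at convergence.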
The variance
estimator in equation (5.2) of Kott (2006) is identical to
in equation (3.1) when
The variance estimator in Kim and Haziza (2014) is also similar, although their prediction model is more general than the linear prediction model considered here.
This variance
estimator
presupposes that the original
sampling design is such that each element can be drawn at most once. In Section
3.1, we see that when the probabilities of response are independent (Poisson),
then under mild assumptions,
is a nearly unbiased estimator of
the mean squared error of
under the quasi-sampling design
whether or not the prediction model,
holds.
In Section 3.2,
is shown to be a nearly unbiased
estimator for the combined prediction-model and original-sampling-design
variance of
as an estimator for
whether or not the response model
in equation (2.4) holds. Thus,
can be called a “simultaneous
variance estimator”.
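For the design component of such variance estimators, the workhorse is a double sum over sampled pairs weighted by joint selection probabilities. The sketch below is a standard Horvitz-Thompson-type form, not the paper's exact equation (3.1); as a sanity check, it reproduces the textbook variance-estimator formula under simple random sampling without replacement.

```python
import numpy as np

# Generic double-sum, Horvitz-Thompson-type design-variance estimator:
#   sum_j sum_k (pi_jk - pi_j pi_k) / pi_jk * (e_j / pi_j) (e_k / pi_k),
# where pi_k are inclusion probabilities and pi_jk joint inclusion
# probabilities (with pi_kk = pi_k).  Illustrative, not the paper's (3.1).

def ht_double_sum(e, pi, pij):
    z = e / pi
    return float(z @ ((pij - np.outer(pi, pi)) / pij) @ z)

# Sanity check under simple random sampling without replacement (SRSWOR):
# the double sum collapses to the textbook estimator N^2 (1 - n/N) s^2 / n.
N, n = 1000, 100
rng = np.random.default_rng(1)
y = rng.normal(10.0, 2.0, size=n)
pi = np.full(n, n / N)
pij = np.full((n, n), n * (n - 1) / (N * (N - 1)))
np.fill_diagonal(pij, n / N)
v = ht_double_sum(y, pi, pij)
s2 = y.var(ddof=1)
assert np.isclose(v, N**2 * (1 - n / N) * s2 / n)
```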
3.1 Variance estimation under the response model
For ease of
exposition we will assume that the response model in equation (2.4) with a
finite
holds. Sufficient conditions for
to be a nearly unbiased estimator
for the mean squared error of
(by which we mean that the bias converges to 0
as the sample size grows arbitrarily large) are
and
is of full rank and is
bounded in probability as the sample size grows arbitrarily large.
From these,
being bounded when
is finite, and the Cauchy-Schwarz
inequality,
it is
not hard to see not only that
is a consistent estimator for
but also that
in equation (3.2) (which can be rendered
has a probability limit, call it
whether or not the prediction
model holds. Moreover, both
and
are
Observe that
where
The insertion of the
into the “regression coefficient”
allows us to ignore the contribution to
quasi-design mean squared error of the second term in this sum,
That is because
is true by definition, which implies
is
under our assumptions. Moreover, since
is also
is
which is asymptotically ignorable relative to
the two
components of
With the
contribution of
eliminated from consideration, an
idealized, but not calculable, nearly unbiased estimator for the quasi-design
mean squared error of
is
where the first term on the right estimates the
mean squared error before nonresponse (if any) and the second the added
variance due to nonresponse.
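The "added variance due to nonresponse" has a simple closed form when response is independent across elements. The Monte Carlo sketch below checks that mechanism on simulated data; the response probabilities, weights, and use of raw y-values in place of the paper's residuals are all illustrative assumptions.

```python
import numpy as np

# Monte Carlo check of the added variance under an independent (Poisson)
# response model (illustrative data throughout): conditional on the realized
# sample, t = sum_k (d_k / p_k) R_k y_k has variance
#   sum_k d_k^2 (1 - p_k) / p_k * y_k^2,
# since the R_k are independent Bernoulli(p_k).  The paper applies this idea
# to calibration residuals rather than raw y-values.

rng = np.random.default_rng(2)
n = 300
d = rng.uniform(1.0, 4.0, size=n)          # design weights (fixed sample)
y = rng.normal(5.0, 1.0, size=n)
p = rng.uniform(0.4, 0.9, size=n)          # response probabilities

analytic = float(np.sum(d**2 * (1 - p) / p * y**2))

reps = 20000
R = rng.uniform(size=(reps, n)) < p        # independent response indicators
t = (R * (d / p) * y).sum(axis=1)          # inverse-probability estimates
```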
An alternative
nearly unbiased idealized mean squared error estimator, closer to being
calculable, is
where again
when
otherwise. Since the
are independent under the response model with
mean
and variance
when
By contrast, the following holds when
The first summation on the right-hand side of equation (3.7) has terms where
and
terms where
the latter of which causes the second summation in (3.7)
to differ from the second summation on the right-hand side of equation (3.6). Note that the expectation under the response
model of
in the second summation on the right-hand side
of (3.7) is
Finally,
can be replaced by the
asymptotically identical, but computable,
in equation (3.1) since
is bounded for all
under assumptions (3.3) and (3.4),
allowing
and
to be substituted for the unknown
and
respectively (because
and
are
for all
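Putting the pieces together, a computable estimator of the general two-part shape discussed in this section, a double-sum design component plus a nonresponse component of the form sum_k w_k (w_k - d_k) e_k^2, can be sketched as follows. The specific weighting of each term in the paper's equation (3.1) is not reproduced; every input below is a placeholder.

```python
import numpy as np

# Sketch of a computable two-part variance estimate (all inputs are
# placeholders): a Horvitz-Thompson-type double sum evaluated at weighted
# residuals, plus a nonresponse component sum_k w_k (w_k - d_k) e_k^2, which
# vanishes under full response (w_k = d_k) and for nonrespondents (w_k = 0).

def two_part_variance(e, pi, pij, w, d):
    z = w * e
    design = float(z @ ((pij - np.outer(pi, pi)) / pij) @ z)
    nonresponse = float(np.sum(w * (w - d) * e**2))
    return design + nonresponse

# Illustrative inputs: SRSWOR-style inclusion probabilities, Poisson
# response, inverse-probability weights for respondents.
rng = np.random.default_rng(5)
n, Npop = 50, 500
pi = np.full(n, n / Npop)
pij = np.full((n, n), n * (n - 1) / (Npop * (Npop - 1)))
np.fill_diagonal(pij, n / Npop)
d = 1.0 / pi
p_hat = rng.uniform(0.5, 0.9, size=n)
resp = rng.uniform(size=n) < p_hat
w = np.where(resp, d / p_hat, 0.0)         # 0 for nonrespondents
e_hat = rng.normal(size=n)                 # stand-in residuals
v = two_part_variance(e_hat, pi, pij, w, d)
```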
3.2 Variance estimation under the prediction model
Matters are a bit
simpler when we assume a prediction model holds but not necessarily the
response model in equation (2.4). Suppose
whether or not
is
sampled or responds when sampled, and the
are
uncorrelated random variables with variances equal to
where
need not be specified other than
having finite components.
The mean squared
error of
as an estimator for
under that prediction model is
the sum of the prediction variance of
as an estimator for
(see, for example, Kott 2009, page
69), and the squared bias,
the latter being zero when
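The prediction-variance calculation can be checked numerically. In the sketch below, the weights are calibrated to the population x-totals with a linear (GREG-type) adjustment, so the error of the weighted total reduces to a weighted sum of the model errors; the population, model variances, and respondent set are all illustrative assumptions.

```python
import numpy as np

# Numerical check of the prediction variance (illustrative assumptions
# throughout): y_k = x_k' beta + eps_k with uncorrelated eps_k of variance
# sigma2_k.  If the weights are calibrated to the population x-totals, the
# error of sum_R w_k y_k as an estimator of the population total is
#   sum_{k in R} (w_k - 1) eps_k  -  sum_{k not in R} eps_k,
# with variance sum_R (w_k - 1)^2 sigma2_k + sum_{U \ R} sigma2_k.

rng = np.random.default_rng(3)
N = 300
x = np.column_stack([np.ones(N), rng.uniform(1.0, 2.0, size=N)])
beta = np.array([1.0, 2.0])
sigma2 = 0.5 + x[:, 1]                     # heteroscedastic model variances

resp = np.zeros(N, dtype=bool)
resp[:100] = True                          # hypothetical respondent set
xr = x[resp]

# Linear (GREG-type) calibration of uniform starting weights to the
# population x-totals.
tx = x.sum(axis=0)
w0 = np.full(resp.sum(), N / resp.sum())
lam = np.linalg.solve(xr.T @ (w0[:, None] * xr),
                      tx - (w0[:, None] * xr).sum(axis=0))
w = w0 * (1.0 + xr @ lam)

analytic = np.sum((w - 1.0) ** 2 * sigma2[resp]) + np.sum(sigma2[~resp])

reps = 20000
eps = rng.normal(size=(reps, N)) * np.sqrt(sigma2)
y = x @ beta + eps                         # model realizations, shape (reps, N)
err = (w * y[:, resp]).sum(axis=1) - y.sum(axis=1)
```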
The combined variance of
as an estimator for
under the prediction model and
original sample design is
where the subscript
denotes that the operation (variance or
expectation) is with respect to the original sampling design. Recall
for
To see that
in equation (3.1) provides a
nearly unbiased estimator for
observe first that
Let
when
and
otherwise. Because the
are uncorrelated, and
it is now not hard to show that
for almost every
pair under the prediction model when
converges to an invertible matrix, and assumptions
(3.3), (3.4), and
hold. Observe that the change from the assumptions
in (3.5) to (3.8) makes the relative bias of
as an estimator for
rather than
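The "combined" variance in this section rests on the usual decomposition of total variance over the sampling design and the prediction model jointly. The toy Monte Carlo below verifies that decomposition for an expansion estimator under SRSWOR with uncorrelated model errors; the design, model, and estimator are illustrative, not the paper's.

```python
import numpy as np

# Toy verification of the combined-variance decomposition (illustrative):
#   Var(t) = Var_S( E_eps[t | S] ) + E_S( Var_eps[t | S] ),
# for the expansion estimator t = (N/n) sum_{k in S} y_k with
# y_k = mu_k + eps_k, eps_k uncorrelated with variance sigma2_k, and S an
# SRSWOR sample.  This only exercises the decomposition, not the paper's
# estimator.

rng = np.random.default_rng(4)
N, n = 40, 10
mu = rng.normal(10.0, 3.0, size=N)         # "x_k' beta" components
sigma2 = rng.uniform(0.5, 2.0, size=N)     # model variances

reps = 20000
samples = np.array([rng.choice(N, size=n, replace=False) for _ in range(reps)])
eps = rng.normal(size=(reps, n)) * np.sqrt(sigma2[samples])
t = (N / n) * (mu[samples] + eps).sum(axis=1)

# Analytic pieces: design variance of (N/n) sum mu_k under SRSWOR, plus the
# expected model variance (N/n)^2 * n * mean(sigma2_k).
var_design = N**2 * (1 - n / N) * mu.var(ddof=1) / n
e_model = (N / n) ** 2 * n * sigma2.mean()
```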