Optimum allocation for a dual-frame telephone survey
4. Comparing the take-all and screening protocols
We compare the
take-all and screening protocols to establish which is the less costly or more
efficient. Such a comparison can provide practical guidance to planners of
future dual-frame telephone surveys.
4.1 Comparing the minimum variances and costs
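Given either fixed cost or fixed variance, efficiency can be assessed in terms
of the ratio (4.1) of the minimum variance (or minimum cost) under the
screening protocol to the corresponding minimum under the take-all protocol.
Values less than 1.0 favor the screening approach, while values greater than
1.0 favor the take-all approach.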
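The decision rule can be sketched in a few lines of Python. This is only an illustration: it assumes the ratio places the screening protocol in the numerator and the take-all protocol in the denominator, which is what the interpretation of values below and above 1.0 implies. The function names and the input values are hypothetical.

```python
def efficiency(screening_value: float, take_all_value: float) -> float:
    """Ratio of the screening minimum (variance or cost) to the take-all minimum.

    Assumption: screening is the numerator, so ratios below 1.0 favor screening.
    """
    if take_all_value <= 0:
        raise ValueError("take_all_value must be positive")
    return screening_value / take_all_value


def preferred_protocol(ratio: float) -> str:
    """Name the protocol favored by a given efficiency ratio."""
    if ratio < 1.0:
        return "screening"
    if ratio > 1.0:
        return "take-all"
    return "tie"


# Hypothetical minimum variances in arbitrary units, not values from the paper.
ratio = efficiency(8.2, 10.0)
print(round(ratio, 2), preferred_protocol(ratio))
```

Because the ratio is unit-free, the same helper applies whether the comparison is made at fixed cost (comparing variances) or at fixed variance (comparing costs).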
We will illustrate
efficiency using six scenarios regarding a survey of a hypothetical adult
population. For all scenarios, the population size is taken from the March 2010
Current Population Survey (http://www.census.gov/cps/data/) and the population
proportions by telephone status are obtained from the January-June
2010 National Health Interview Survey (Blumberg and Luke 2010). The values are
and
For all scenarios, the aim of the
survey is taken to be the estimation of the total number of adults with a
certain attribute.
The scenario-specific assumptions are set forth in the following table:
Table 4.1
Definition of six scenarios for a hypothetical adult population
Entries are means for each sampling frame and telephone status domain.

Scenario  Landline frame  LLO domain  Dual-user domain  CPO domain  Cell-phone frame
       1           0.791       0.750             0.800       0.750             0.784
       2           0.759       0.800             0.750       0.750             0.750
       3           0.500       0.500             0.500       0.500             0.500
       4           0.518       0.600             0.500       0.400             0.469
       5           0.209       0.250             0.200       0.250             0.216
       6           0.241       0.200             0.250       0.250             0.250

The means
correspond to the proportions of adults with the attribute. Scenario 1
describes a population in which the domain means are similar, with the mean of
the dual-user domain being somewhat larger than the means of the CPO and LLO
populations. Scenario 2 describes a population in which the mean of the LLO
domain is somewhat larger than the means of the other telephone status domains.
Scenario 3 reflects a population in which the means of all telephone status
domains are equal. Scenario 4 reflects a population in which the mean of the
LLO domain is much larger than the mean of the CPO domain. Scenarios 5 and 6
mirror Scenarios 1 and 2, respectively: each mean equals one minus the
corresponding mean in the earlier scenario. The mean of the CPO domain
declines, or stays level, from each scenario to the next across Scenarios
1 to 6.
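The relationships among the scenarios can be checked directly from the Table 4.1 values. The sketch below is illustrative only. It stores each row in the table's column order and assumes that the fourth value in each row is the CPO domain mean, which is how the scenario descriptions read. The helper name and the rounding tolerance are our own choices.

```python
# Rows of Table 4.1: five mean values per scenario, in the table's column order.
SCENARIOS = {
    1: (0.791, 0.750, 0.800, 0.750, 0.784),
    2: (0.759, 0.800, 0.750, 0.750, 0.750),
    3: (0.500, 0.500, 0.500, 0.500, 0.500),
    4: (0.518, 0.600, 0.500, 0.400, 0.469),
    5: (0.209, 0.250, 0.200, 0.250, 0.216),
    6: (0.241, 0.200, 0.250, 0.250, 0.250),
}


def is_complement(row, other, tol=5e-4):
    """True if every value in `other` equals one minus the matching value in `row`."""
    return all(abs((1.0 - a) - b) <= tol for a, b in zip(row, other))


# Scenarios 5 and 6 should be the complements of Scenarios 1 and 2.
for base, mirror in ((1, 5), (2, 6)):
    print(base, mirror, is_complement(SCENARIOS[base], SCENARIOS[mirror]))

# Assumed CPO column (fourth value): it should never increase across scenarios.
cpo_means = [SCENARIOS[s][3] for s in sorted(SCENARIOS)]
cpo_non_increasing = all(a >= b for a, b in zip(cpo_means, cpo_means[1:]))
print(cpo_means, cpo_non_increasing)
```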
We selected the
six scenarios to illustrate various circumstances in which the means of CPO,
LLO, and dual-user domains differ. Differences can arise because younger
adults, Hispanics, adults living only with unrelated adult roommates, adults
renting their home, and adults living in poverty tend to be CPO (Blumberg and
Luke 2013). To gain insight into the relative efficiencies of the take-all and
screening designs, planners of future surveys may repeat our calculations for
new scenarios specified by them and tailored to the particulars of their
applications.
We will consider
the six scenarios using three assumed cost structures. The cost structures are
intended to illuminate various circumstances in which the per-unit cost of
screening is high or low relative to the cost of the survey interview, with
Cost Structures 1-3 reflecting increasing relative cost of screening. All cost
components are expressed in interviewing hours:
Cost Structure 1: and
Cost Structure 2: and
Cost Structure 3: and
All three cost structures reflect circumstances in which the hours per case for
a cell-phone interview are about twice the hours per case for a landline
interview.
Efficiencies
corresponding to the various scenarios for the first cost structure are
illustrated in Figure 4.1. We have prepared similar figures for the second
and third cost structures, but to conserve space we do not present them here.
Description for Figure 4.1
Plot of efficiency against the mixing parameter for the six scenarios, given the first cost structure. Efficiency is on the y-axis, ranging from 0.50 to 0.95. The mixing parameter is on the x-axis, ranging from 0.05 to 0.95. The curves are convex. The value of the mixing parameter at which efficiency is minimal differs between Scenarios 3 to 6 and Scenarios 1 and 2. The maximal efficiency is about 0.92 for Scenario 1 and 0.89 for Scenario 2; about 0.82 for Scenario 3; and about 0.80, 0.79, and 0.75 for Scenarios 4, 5, and 6, respectively.
Given Cost
Structure 1, the screening approach achieves the lower variance for the same
fixed cost for all six scenarios. Given Cost Structure 3, in which the per-unit
cost of screening is relatively much higher than in Cost Structure 1, the
take-all approach achieves a smaller variance than the screening approach for
half of the population scenarios. For Cost Structure 2, which entails an
intermediate level of screening cost, the screening approach beats the take-all
approach for all scenarios except for Scenario 1, in which the two approaches
are nearly equally efficient.
The comparison between the take-all and screening protocols can be understood
by examining the form of the efficiency ratio in (4.1). The unit cost of
screening enters only through a single term in the numerator of the ratio.
Thus, for a given scenario, the value of the ratio must increase with
increasing screening cost. For smaller screening costs, the ratio may be less
than 1.0, in which case the screening protocol will be preferred, while for
larger screening costs, the ratio may exceed 1.0, in which case the take-all
protocol will be preferred.
It is also of interest to examine how the efficiency varies with the domain
means (i.e., the domain proportions), given a fixed cost structure. We see in
(4.1) and in the definitions of the variance components that as long as the
domain means move reasonably together, as they do in our scenarios, the
variation has relatively little or no impact on two of the quantities
involved, and the efficiency will tend to vary more directly with a third
quantity and in turn with the value of a ratio defined in the CPO domain. The
smaller the mean in the CPO domain, the smaller this ratio will be, and in
turn the smaller the efficiency will be. Thus, in each of the cost structures,
we see smaller values of the efficiency in Scenarios 5 and 6 than in
Scenarios 1 and 2, and intermediate values in Scenarios 3 and 4.
For the take-all protocol, the optimum values of the mixing parameter are
located at the points at which the efficiencies reach their maximum values.
Table 4.2 reveals the optimum sample sizes and the optimum mixing parameters
for each scenario and cost structure, assuming a fixed cost budget of 1,000
interviewing hours. For the screening protocol, the table also gives the
number of cell-phone interviews we expect to complete. For all population
scenarios and cost structures studied here, the screening protocol obtains
fewer completed cell-phone interviews than does the take-all protocol. The
take-all design uses resources for interviewing dual-user cases in both of the
samples and requires more cell-phone interviews to provide adequate
representation of CPO cases, while the screening design can be more efficient
about interviewing CPO cases at the price of using resources to conduct the
requisite screening interviews. The optimum values of the mixing parameter
fall approximately in the range from 0.4 to 0.6, and the variance under the
take-all protocol is fairly flat within this range. We examine this issue
further in Section 4.2.
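These statements can be checked against the Table 4.2 values with a short script. It is a sketch under a stated assumption: the six values in each table row are read as landline sample size, cell-phone sample size, and expected completed cell-phone interviews for the screening design, followed by optimum mixing parameter, landline sample size, and cell-phone sample size for the take-all design. That reading is inferred from the text and is not a labeling given in the source.

```python
# Table 4.2, keyed by (scenario, cost structure). Assumed column order:
# screening (landline n, cell-phone n, expected cell-phone interviews),
# then take-all (mixing parameter, landline n, cell-phone n).
TABLE_4_2 = {
    (1, 1): (494, 747, 234, 0.45, 337, 331),
    (1, 2): (469, 641, 201, 0.45, 337, 331),
    (1, 3): (431, 505, 159, 0.45, 337, 331),
    (2, 1): (506, 728, 229, 0.45, 339, 330),
    (2, 2): (481, 626, 197, 0.45, 339, 330),
    (2, 3): (443, 494, 155, 0.45, 339, 330),
    (3, 1): (583, 615, 193, 0.50, 344, 328),
    (3, 2): (559, 533, 167, 0.50, 344, 328),
    (3, 3): (520, 425, 134, 0.50, 344, 328),
    (4, 1): (605, 582, 183, 0.55, 377, 312),
    (4, 2): (581, 506, 159, 0.55, 377, 312),
    (4, 3): (543, 405, 127, 0.55, 377, 312),
    (5, 1): (606, 581, 182, 0.55, 358, 321),
    (5, 2): (582, 505, 159, 0.55, 358, 321),
    (5, 3): (544, 404, 127, 0.55, 358, 321),
    (6, 1): (618, 563, 177, 0.55, 354, 323),
    (6, 2): (594, 490, 154, 0.55, 354, 323),
    (6, 3): (557, 393, 123, 0.55, 354, 323),
}


def screening_has_fewer_cell_interviews(row):
    """Compare expected screening completes with the take-all cell-phone sample."""
    return row[2] < row[5]


def screening_yield(row):
    """Share of the screening design's cell-phone sample expected to be interviewed."""
    return row[2] / row[1]


all_fewer = all(screening_has_fewer_cell_interviews(r) for r in TABLE_4_2.values())
yields = [screening_yield(r) for r in TABLE_4_2.values()]
p_values = sorted({r[3] for r in TABLE_4_2.values()})
print(all_fewer, round(min(yields), 3), round(max(yields), 3), p_values)
```

Under this reading, the take-all columns do not change with the cost structure, which is expected because the cost of screening plays no role in the take-all design.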
In summary, one
may conclude from these illustrations that the screening approach is often more
efficient than the take-all approach. As the cost of the screener increases
relative to the cost of the interview, the outcome can tip in favor of the
take-all approach. The take-all approach will be preferred for surveys in which
the cost of the screener is relatively very high; otherwise, the screening
protocol will be preferred. The screening approach will tend to be relatively
more efficient for small values of the CPO domain mean than for large values of
this mean.
Table 4.2
Sample sizes and optimum mixing parameter for the take-all and screening designs

                             Screening Design                  Take-All Design
                     Landline  Cell-phone    Expected    Optimum  Landline  Cell-phone
               Cost    sample      sample  cell-phone     mixing    sample      sample
Scenario  Structure      size        size  interviews  parameter      size        size
       1          1       494         747         234       0.45       337         331
       1          2       469         641         201       0.45       337         331
       1          3       431         505         159       0.45       337         331
       2          1       506         728         229       0.45       339         330
       2          2       481         626         197       0.45       339         330
       2          3       443         494         155       0.45       339         330
       3          1       583         615         193       0.50       344         328
       3          2       559         533         167       0.50       344         328
       3          3       520         425         134       0.50       344         328
       4          1       605         582         183       0.55       377         312
       4          2       581         506         159       0.55       377         312
       4          3       543         405         127       0.55       377         312
       5          1       606         581         182       0.55       358         321
       5          2       582         505         159       0.55       358         321
       5          3       544         404         127       0.55       358         321
       6          1       618         563         177       0.55       354         323
       6          2       594         490         154       0.55       354         323
       6          3       557         393         123       0.55       354         323

4.2 Choosing the mixing parameter for the take-all protocol
The optimum allocation is defined in terms of the mixing parameter, and thus
it is important to consider the choice of this parameter. In the foregoing
section, we saw that variance is likely not very sensitive to the choice of
the mixing parameter within a reasonable neighborhood of its optimum value.
While the actual optimum will never be known in practical applications, in
this section we describe a practical method that statisticians may use to
select a reasonable, near-optimum value.
The landline and cell-phone samples each supply an estimator of the total in
the dual-user domain, and the mixing parameter is used to combine the two
estimators into one best estimator for this domain. When the estimator of the
dual-user domain derived from the landline sample is the more precise, the
mixing parameter should be relatively large, and conversely, when the
estimator from the cell-phone sample is the more precise, one minus the mixing
parameter should be relatively large. It makes good statistical sense to
consider the value of the mixing parameter that is proportional to the
expected sample size in the dual-user domain, where the optimum allocation is
itself based on this choice of the mixing parameter. Thus, the mixing
parameter is a root of the resulting equation and, in turn, the optimum sample
sizes are defined in terms of the mixing parameter.
From (4.2) it is apparent that the mixing parameter is a function of the
variable of interest. Use of this variable-specific value in actual practice
could imply a different sample size and set of survey weights for each
variable of interest, which would be unworkable. To provide a practicable
solution, one might consider use of the mixing parameter that corresponds to
the survey variable that equals 1 for every unit (the population total
corresponding to this variable is simply the total number of unique units on
the two sampling frames). Given this approach, the mixing parameter is a root
of the corresponding equation. For the cost structures considered in this
section, the corresponding value is 0.52. In Figure 4.1, one can see that this
value is very close to the exact optimum under the various scenarios, with
little loss in efficiency. Alternatively, one could evaluate (4.2) for a small
set of the most important items in the survey, choose a good compromise value
of the mixing parameter, and then define the optimum allocation in terms of
this one compromise value.
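The circular definition described in this section (the mixing parameter depends on the sample sizes, and the optimum sample sizes depend on the mixing parameter) can be illustrated with a simple fixed-point iteration. This is a minimal sketch and not the authors' procedure. It assumes that the mixing parameter weights the landline-sample estimator and equals the landline sample's share of the expected dual-user sample size. The allocation rule `toy_allocation`, the dual-user shares 0.8 and 0.7, and all function names are hypothetical stand-ins for the paper's optimum allocation formulas and population values.

```python
from typing import Callable, Tuple


def proportional_mixing_parameter(
    n_landline: float,
    n_cell: float,
    dual_share_landline: float,
    dual_share_cell: float,
) -> float:
    """Landline share of the expected dual-user sample size.

    dual_share_* is the (assumed known) proportion of each frame that is dual-user.
    """
    expected_dual_landline = n_landline * dual_share_landline
    expected_dual_cell = n_cell * dual_share_cell
    total = expected_dual_landline + expected_dual_cell
    if total <= 0:
        raise ValueError("expected dual-user sample size must be positive")
    return expected_dual_landline / total


def solve_mixing_parameter(
    allocate: Callable[[float], Tuple[float, float]],
    dual_share_landline: float,
    dual_share_cell: float,
    start: float = 0.5,
    tol: float = 1e-10,
    max_iter: int = 200,
) -> float:
    """Iterate p -> allocation(p) -> p until the value stops changing."""
    p = start
    for _ in range(max_iter):
        n_landline, n_cell = allocate(p)
        p_next = proportional_mixing_parameter(
            n_landline, n_cell, dual_share_landline, dual_share_cell
        )
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("fixed-point iteration did not converge")


def toy_allocation(p: float) -> Tuple[float, float]:
    """Hypothetical allocation rule, used only to make the example runnable."""
    return 300.0 + 100.0 * p, 400.0 - 100.0 * p


p_star = solve_mixing_parameter(toy_allocation, 0.8, 0.7)
print(round(p_star, 3))
```

In practice, `allocate` would be replaced by the optimum allocation for the chosen survey variable, and a general root-finder could be used in place of the fixed-point loop if the iteration fails to converge.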
Survey Methodology publishes articles dealing with various aspects of statistical development relevant to a statistical agency, such as design issues in the context of practical constraints, use of different data sources and collection techniques, total survey error, survey evaluation, research in survey methodology, time series analysis, seasonal adjustment, demographic studies, data integration, estimation and data analysis methods, and general survey systems development. The emphasis is placed on the development and evaluation of specific methodologies as applied to data collection or the data themselves. All papers will be refereed. However, the authors retain full responsibility for the contents of their papers and opinions expressed are not necessarily those of the Editorial Board or of Statistics Canada.
Copyright
Published by authority of the Minister responsible for Statistics Canada.