Optimum allocation for a dual-frame telephone survey 4. Comparing the take-all and screening protocols

We compare the take-all and screening protocols to establish which is the less costly or more efficient. Such a comparison can provide practical guidance to planners of future dual-frame telephone surveys.

4.1 Comparing the minimum variances and costs

Given either fixed cost or fixed variance, efficiency can be assessed in terms of the ratio

$E = \frac{\min [\operatorname{Var}\{\hat{Y}\}]}{\min [\operatorname{Var}\{\ddot{Y}\}]} = \frac{\min [C_{SC}]}{\min [C_{TA}]} = \frac{(\sqrt{c_{A}} R_{A} + \sqrt{c_{B}^{\prime\prime\prime}} R_{B})^{2}}{(\sqrt{c_{A}} Q_{A} + \sqrt{c_{B}} Q_{B})^{2}}. \qquad (4.1)$

Values less than 1.0 favor the screening approach while values greater than 1.0 favor the take-all approach.

We will illustrate efficiency using six scenarios regarding a survey of a hypothetical adult population. For all scenarios, the population size is taken from the March 2010 Current Population Survey (http://www.census.gov/cps/data/) and the population proportions by telephone status are obtained from the January to June 2010 National Health Interview Survey (Blumberg and Luke 2010). The values are $N_{A} = 83,451,980$, $N_{a} = 15,162,402$, $N_{ab} = 68,289,578$, $N_{b} = 31,265,108$, $N_{B} = 99,554,686$, $\alpha = 0.818$, and $\beta = 0.686$. For all scenarios, the aim of the survey is taken to be the estimation of the total number of adults with a certain attribute.

The scenario-specific assumptions are set forth in the following table:

Table 4.1
Definition of six scenarios for a hypothetical adult population
Scenarios	${\bar{Y}}_{A}$	${\bar{Y}}_{a}$	${\bar{Y}}_{a b}$	${\bar{Y}}_{b}$	${\bar{Y}}_{B}$
1	0.791	0.750	0.800	0.750	0.784
2	0.759	0.800	0.750	0.750	0.750
3	0.500	0.500	0.500	0.500	0.500
4	0.518	0.600	0.500	0.400	0.469
5	0.209	0.250	0.200	0.250	0.216
6	0.241	0.200	0.250	0.250	0.250

The means correspond to the proportions of adults with the attribute. Scenario 1 describes a population in which the domain means are similar, with the mean of the dual-user domain being somewhat larger than the means of the CPO and LLO populations. Scenario 2 describes a population in which the mean of the LLO domain is somewhat larger than the means of the other telephone status domains. Scenario 3 reflects a population in which the means of all telephone status domains are equal. Scenario 4 reflects a population in which the mean of the LLO domain is much larger than the mean of the CPO domain. Scenarios 5 and 6 correspond to Scenarios 1 and 2, respectively, using means equal to one minus the corresponding means. The mean of the CPO domain declines across the scenarios, from 0.750 in Scenarios 1 and 2 to 0.250 in Scenarios 5 and 6.
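As a quick arithmetic check, the population quantities and the frame-level columns of Table 4.1 can be recomputed from the domain sizes and domain means. The Python sketch below assumes (the definitions are not restated in this section) that $α$ and $β$ are the dual-user shares of the landline and cell-phone frames, and that each frame mean is the size-weighted average of its two domain means. The function and variable names are ours, chosen for illustration.

```python
# Domain sizes quoted in the text (adults): LLO, dual-user, CPO.
N_a, N_ab, N_b = 15_162_402, 68_289_578, 31_265_108
N_A, N_B = N_a + N_ab, N_ab + N_b      # landline frame, cell-phone frame

# Assumed definitions: share of each frame that is dual-user.
alpha = N_ab / N_A
beta = N_ab / N_B


def frame_means(y_a, y_ab, y_b):
    """Frame means as size-weighted averages of the domain means (assumed)."""
    y_A = (1 - alpha) * y_a + alpha * y_ab
    y_B = (1 - beta) * y_b + beta * y_ab
    return y_A, y_B


# Domain means (Y_a, Y_ab, Y_b) for the six scenarios of Table 4.1.
scenarios = {
    1: (0.750, 0.800, 0.750),
    2: (0.800, 0.750, 0.750),
    3: (0.500, 0.500, 0.500),
    4: (0.600, 0.500, 0.400),
    5: (0.250, 0.200, 0.250),
    6: (0.200, 0.250, 0.250),
}

print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
for k, means in scenarios.items():
    y_A, y_B = frame_means(*means)
    print(f"Scenario {k}: Y_A = {y_A:.3f}, Y_B = {y_B:.3f}")
```

Under these assumptions the computed values agree with the $α$, $β$, ${\bar{Y}}_{A}$, and ${\bar{Y}}_{B}$ reported above to three decimal places.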

We selected the six scenarios to illustrate various circumstances in which the means of CPO, LLO, and dual-user domains differ. Differences can arise because younger adults, Hispanics, adults living only with unrelated adult roommates, adults renting their home, and adults living in poverty tend to be CPO (Blumberg and Luke 2013). To gain insight into the relative efficiencies of the take-all and screening designs, planners of future surveys may repeat our calculations for new scenarios specified by them and tailored to the particulars of their applications.

We will consider the six scenarios using three assumed cost structures. The cost structures are intended to illuminate various circumstances in which the per-unit cost of screening is high or low relative to the cost of the survey interview, with Cost Structures 1-3 reflecting increasing relative cost of screening. All cost components are expressed in interviewing hours:

Cost Structure 1: $c_{B}^{\prime} = 0.05$, $c_{B}^{\prime\prime} = 2.05$, $c_{B} = 2.00$, and $c_{A} = 1.00$

Cost Structure 2: $c_{B}^{\prime} = 0.20$, $c_{B}^{\prime\prime} = 2.20$, $c_{B} = 2.00$, and $c_{A} = 1.00$

Cost Structure 3: $c_{B}^{\prime} = 0.50$, $c_{B}^{\prime\prime} = 2.50$, $c_{B} = 2.00$, and $c_{A} = 1.00$.

All three reflect circumstances in which a cell-phone interview requires about twice as many hours per case as a landline interview.

Efficiencies corresponding to the various scenarios for the first cost structure are illustrated in Figure 4.1. We have prepared similar figures for the second and third cost structures, but to conserve space we do not present them here.

Figure 4.1 Efficiency $E$ as a function of the mixing parameter $p$ for the six scenarios, given Cost Structure 1

Description for Figure 4.1

The figure plots efficiency against the mixing parameter $p$ for the six scenarios, given the first cost structure. Efficiency $E$ is on the y-axis, ranging from 0.50 to 0.95. Parameter $p$ is on the x-axis, ranging from 0.05 to 0.95. The curves are concave (inverted-U shaped), each rising to an interior maximum. Efficiency is minimal when $p = 0.05$ for Scenarios 3 to 6 and when $p = 0.95$ for Scenarios 1 and 2. The maximal efficiency is about 0.92 for Scenario 1 and 0.89 for Scenario 2, when $p = 0.45$. For Scenario 3, it is about 0.82 when $p = 0.50$. For Scenarios 4, 5, and 6, the maximal efficiency is about 0.80, 0.79, and 0.75, respectively, when $p = 0.55$.

Given Cost Structure 1, the screening approach achieves the lower variance for the same fixed cost for all six scenarios. Given Cost Structure 3, in which the per-unit cost of screening is relatively much higher than in Cost Structure 1, the take-all approach achieves a smaller variance than the screening approach for half of the population scenarios. For Cost Structure 2, which entails an intermediate level of screening cost, the screening approach beats the take-all approach for all scenarios except for Scenario 1, in which the two approaches are nearly equally efficient.

The comparison between the take-all and screening protocols can be understood by examining the form of efficiency $E$ in (4.1). The unit cost of screening is embedded only within the term $\sqrt{c_{B}^{\prime\prime\prime}} R_{B}$ in the numerator of $E$. Thus, for a given scenario, the value of $E$ must increase with increasing screening cost. For smaller screening costs, $E$ may be less than 1.0, in which case the screening protocol will be preferred, while for larger screening costs, $E$ may exceed 1.0, in which case the take-all protocol will be preferred.

It is also of interest to examine how the efficiency $E$ varies with the domain means (i.e., the domain proportions), given a fixed cost structure. We see in (4.1) and in the definitions of the variance components that as long as the domain means ($\bar{Y}_{b}$, $\bar{Y}_{ab}$, and $\bar{Y}_{a}$) vary reasonably together, as they do in our scenarios, the variation has little or no impact on $Q_{A}^{2}$, $Q_{B}^{2}$, and $R_{A}^{2}$, and $E$ will tend to vary more directly with $R_{B}^{2}$, and in turn with the value of the ratio $\bar{Y}_{b}^{2} / S_{b}^{2}$ in the CPO domain. The smaller the mean in the CPO domain, the smaller this ratio will be, and in turn the smaller $E$ will be. Thus, in each of the cost structures, we see smaller values of $E$ in Scenarios 5 and 6 than in Scenarios 1 and 2, and intermediate values of $E$ in Scenarios 3 and 4.
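To make the behaviour of the ratio concrete, the short Python sketch below evaluates it for the CPO means of the six scenarios. It assumes that the attribute is a 0/1 indicator, so that the unit variance in the CPO domain is approximately ${\bar{Y}}_{b} (1 - {\bar{Y}}_{b})$. This is our simplifying assumption for illustration, and the function name is hypothetical.

```python
def mean_sq_over_var(y_b):
    """Ratio Ybar_b^2 / S_b^2 for a 0/1 attribute, taking S_b^2 = y_b * (1 - y_b)."""
    return y_b**2 / (y_b * (1 - y_b))


# CPO domain means by scenario, from Table 4.1.
cpo_means = {1: 0.750, 2: 0.750, 3: 0.500, 4: 0.400, 5: 0.250, 6: 0.250}

ratios = {k: mean_sq_over_var(y) for k, y in cpo_means.items()}
for k, r in ratios.items():
    print(f"Scenario {k}: Ybar_b = {cpo_means[k]:.3f}, ratio = {r:.3f}")
```

Under this assumption the ratio reduces to ${\bar{Y}}_{b} / (1 - {\bar{Y}}_{b})$, which falls steadily as the CPO mean falls. This matches the ordering of $E$ across scenarios described above.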

For the take-all protocol, the optimum values of $p$ are located at the points at which the efficiencies reach their maximum values. Table 4.2 reveals the optimum sample sizes and the optimum parameters $p$ for each scenario and cost structure, assuming a fixed cost budget of 1,000 interviewing hours. For the screening protocol, we expect to complete $(1 - \beta) n_{B}$ cell-phone interviews. For all population scenarios and cost structures studied here, the screening protocol obtains fewer completed cell-phone interviews than does the take-all protocol. The latter design uses resources for interviewing dual-user cases in both of the samples and requires more cell-phone interviews to provide adequate representation of CPO cases, while the former design can be more efficient about interviewing CPO cases at the price of using resources to conduct the requisite screening interviews. The optimum values of $p$ fall approximately in the range from 0.4 to 0.6 and the variance under the take-all protocol is fairly flat within this range. We examine this issue further in Section 4.2.
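The take-all entries of Table 4.2 can be approximated with a short calculation. The Python sketch below rests on several assumptions of ours that are not restated in this section: the take-all variance has the form $N_{A}^{2} Q_{A}^{2} / n_{A} + N_{B}^{2} Q_{B}^{2} / n_{B}$; $Q_{A}^{2}$ and $Q_{B}^{2}$ are the numerator and denominator on the right side of (4.2); the unit variances of a 0/1 attribute are ${\bar{Y}} (1 - {\bar{Y}})$; and the optimum $p$ is sought on a grid with step 0.05. All function names are illustrative.

```python
import math

# Domain sizes quoted in Section 4.1: LLO, dual-user, CPO.
N_a, N_ab, N_b = 15_162_402, 68_289_578, 31_265_108
N_A, N_B = N_a + N_ab, N_ab + N_b
alpha, beta = N_ab / N_A, N_ab / N_B
c_A, c_B, budget = 1.0, 2.0, 1000.0     # hours per case and total hours


def s2(y):
    """Assumed unit variance of a 0/1 attribute with mean y."""
    return y * (1 - y)


def q_terms(p, y_a, y_ab, y_b):
    """Assumed Q_A and Q_B: roots of the numerator and denominator of (4.2)."""
    qa2 = ((1 - alpha) * s2(y_a) + alpha * p**2 * s2(y_ab)
           + alpha * (1 - alpha) * (y_a - p * y_ab) ** 2)
    qb2 = ((1 - beta) * s2(y_b) + beta * (1 - p) ** 2 * s2(y_ab)
           + beta * (1 - beta) * (y_b - (1 - p) * y_ab) ** 2)
    return math.sqrt(qa2), math.sqrt(qb2)


def take_all_allocation(p, means):
    """Fixed-budget optimum (n_A, n_B) and the resulting minimum variance."""
    q_a, q_b = q_terms(p, *means)
    k_a, k_b = N_A * q_a, N_B * q_b
    denom = k_a * math.sqrt(c_A) + k_b * math.sqrt(c_B)
    n_A = budget * k_a / (math.sqrt(c_A) * denom)
    n_B = budget * k_b / (math.sqrt(c_B) * denom)
    return n_A, n_B, denom**2 / budget


scenario_1 = (0.750, 0.800, 0.750)       # (Y_a, Y_ab, Y_b) from Table 4.1
grid = [i / 20 for i in range(1, 20)]    # p = 0.05, 0.10, ..., 0.95
p_opt = min(grid, key=lambda p: take_all_allocation(p, scenario_1)[2])
n_A, n_B, _ = take_all_allocation(p_opt, scenario_1)
print(p_opt, round(n_A), round(n_B))
```

For Scenario 1 this sketch gives $p = 0.45$ with roughly 337 landline and 331 cell-phone interviews, in line with the corresponding row of Table 4.2. Planners could substitute their own domain means and costs to explore other scenarios.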

In summary, one may conclude from these illustrations that the screening approach is often more efficient than the take-all approach. As the cost of the screener increases relative to the cost of the interview, the outcome can tip in favor of the take-all approach. The take-all approach will be preferred for surveys in which the cost of the screener is very high relative to the cost of the interview; otherwise, the screening protocol will be preferred. The screening approach will tend to be relatively more efficient for small values of the CPO domain mean than for large values of this mean.

Table 4.2
Sample sizes and optimum values of $p$ for the take-all and screening designs
Cost Structure	Screening: $n_{A}$	Screening: $n_{B}$	Screening: $(1 - \beta) n_{B}$	Take-all: $p_{\text{opt}}$	Take-all: $n_{A}$	Take-all: $n_{B}$
Scenario 1
1	494	747	234	0.45	337	331
2	469	641	201	0.45	337	331
3	431	505	159	0.45	337	331
Scenario 2
1	506	728	229	0.45	339	330
2	481	626	197	0.45	339	330
3	443	494	155	0.45	339	330
Scenario 3
1	583	615	193	0.50	344	328
2	559	533	167	0.50	344	328
3	520	425	134	0.50	344	328
Scenario 4
1	605	582	183	0.55	377	312
2	581	506	159	0.55	377	312
3	543	405	127	0.55	377	312
Scenario 5
1	606	581	182	0.55	358	321
2	582	505	159	0.55	358	321
3	544	404	127	0.55	358	321
Scenario 6
1	618	563	177	0.55	354	323
2	594	490	154	0.55	354	323
3	557	393	123	0.55	354	323
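As a consistency check on Table 4.2, one can verify that the tabulated allocations exhaust the budget of about 1,000 interviewing hours. The Python sketch below assumes a cost model that is not spelled out in this section: under screening, each dual-user case in the cell-phone sample costs $c_{B}^{\prime}$ (screened out) and each CPO case costs $c_{B}^{\prime\prime} = c_{B}^{\prime} + c_{B}$ (screened and interviewed). Under take-all, every cell-phone case costs $c_{B}$. The function names are ours.

```python
# Dual-user share of the cell-phone frame, from the domain sizes in Section 4.1.
beta = 68_289_578 / 99_554_686
c_A, c_B = 1.00, 2.00
screener_cost = {1: 0.05, 2: 0.20, 3: 0.50}    # c'_B by cost structure


def screening_cost(n_A, n_B, structure):
    """Expected hours under the assumed screening cost model."""
    c1 = screener_cost[structure]               # screen only (dual users)
    c2 = c1 + c_B                               # screen plus interview (CPO)
    return c_A * n_A + n_B * (beta * c1 + (1 - beta) * c2)


def take_all_cost(n_A, n_B):
    """Hours under take-all: every sampled case is interviewed."""
    return c_A * n_A + c_B * n_B


# Scenario 1 rows of Table 4.2: cost structure -> screening (n_A, n_B).
scenario_1_screening = {1: (494, 747), 2: (469, 641), 3: (431, 505)}
for cs, (n_A, n_B) in scenario_1_screening.items():
    print(f"Cost Structure {cs}: {screening_cost(n_A, n_B, cs):.1f} hours")

print(f"Take-all, Scenario 1: {take_all_cost(337, 331):.1f} hours")
```

Under these assumptions each allocation comes to within about one hour of the 1,000-hour budget, the small differences being attributable to rounding of the tabulated sample sizes.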

4.2 Choosing the mixing parameter $p$ for the take-all protocol

The optimum allocation is defined in terms of the mixing parameter, and thus it is important to consider the choice of this parameter. In the foregoing section, we saw that variance is likely not very sensitive to the choice of $p$ within a reasonable neighborhood of the optimum $p$. While the actual optimum $p$ will never be known in practical applications, in this section we describe a practical method that statisticians may use to select a reasonable, near-optimum value of $p$.

The landline and cell-phone samples each supply an estimator of the total in the dual-user domain, and the mixing parameter $p$ is used to combine the two estimators into one best estimator for this domain. When the estimator of the dual-user domain derived from the landline sample is the more precise, $p$ should be relatively large, and conversely, when the estimator from the cell-phone sample is the more precise, then $q = 1 - p$ should be relatively large. It makes good statistical sense to consider the value of $p$ that is proportional to the expected sample size in the dual-user domain, i.e., $p_{o} = \alpha n_{A,\text{opt}} / (\alpha n_{A,\text{opt}} + \beta n_{B,\text{opt}})$, where the optimum allocation is based on this choice of $p$. Thus, $p_{o}$ is a root of the equation

$\frac{c_{A} p^{2}}{c_{B} (1 - p)^{2}} = \frac{(1 - \alpha) S_{a}^{2} + \alpha p^{2} S_{ab}^{2} + \alpha (1 - \alpha) (\bar{Y}_{a} - p \bar{Y}_{ab})^{2}}{(1 - \beta) S_{b}^{2} + \beta (1 - p)^{2} S_{ab}^{2} + \beta (1 - \beta) \{\bar{Y}_{b} - (1 - p) \bar{Y}_{ab}\}^{2}}, \qquad (4.2)$

and, in turn, $n_{A,\text{opt}}$ and $n_{B,\text{opt}}$ are defined in terms of $p_{o}$.
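Equation (4.2) has no convenient closed-form solution in general, but a root in $(0, 1)$ is easy to locate numerically. The Python sketch below cross-multiplies (4.2) and applies bisection. The cross-multiplied expression is negative at $p = 0$ and positive at $p = 1$, so the interval brackets a root. It assumes a 0/1 attribute with unit variances ${\bar{Y}} (1 - {\bar{Y}})$ and uses Scenario 4 of Section 4.1 as input. The function names and the tolerance are our own choices.

```python
# Domain sizes quoted in Section 4.1: LLO, dual-user, CPO.
N_a, N_ab, N_b = 15_162_402, 68_289_578, 31_265_108
alpha = N_ab / (N_a + N_ab)
beta = N_ab / (N_ab + N_b)
c_A, c_B = 1.0, 2.0


def s2(y):
    """Assumed unit variance of a 0/1 attribute with mean y."""
    return y * (1 - y)


def residual(p, y_a, y_ab, y_b):
    """Left side minus right side of (4.2) after cross-multiplying."""
    num = ((1 - alpha) * s2(y_a) + alpha * p**2 * s2(y_ab)
           + alpha * (1 - alpha) * (y_a - p * y_ab) ** 2)
    den = ((1 - beta) * s2(y_b) + beta * (1 - p) ** 2 * s2(y_ab)
           + beta * (1 - beta) * (y_b - (1 - p) * y_ab) ** 2)
    return c_A * p**2 * den - c_B * (1 - p) ** 2 * num


def solve_p_o(means, tol=1e-10):
    """Bisection on [0, 1]; the residual changes sign from negative to positive."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid, *means) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


scenario_4 = (0.600, 0.500, 0.400)       # (Y_a, Y_ab, Y_b) from Table 4.1
p_o = solve_p_o(scenario_4)
print(f"p_o for Scenario 4: {p_o:.3f}")
```

Bisection is used here only because it is simple and robust. Any standard root finder would serve equally well.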

From (4.2) it is apparent that $p_{o}$ is a function of the $y$-variable of interest. Use of this $p_{o}$ in actual practice could imply a different sample size and set of survey weights for each variable of interest, which would be unworkable. To provide a practicable solution, one might consider use of the $p_{o}$ that corresponds to the survey variable $y \equiv 1$ (the population total corresponding to this variable is simply the total number of unique units on the two sampling frames). Given this approach, $p_{o}$ is a root of the equation

$\frac{c_{A} p^{2}}{c_{B} (1 - p)^{2}} = \frac{\alpha (1 - \alpha) (1 - p)^{2}}{\beta (1 - \beta) p^{2}}. \qquad (4.3)$

For the cost structures considered in this section, the corresponding $p_{o}$ is 0.52. In Figure 4.1, one can see that this value is very close to the exact optimum values of $p$ under the various scenarios, with little loss in efficiency. Alternatively, one could evaluate (4.2) for a small set of the most important items in the survey; choose a good compromise value of $p$; and then define the optimum allocation in terms of this one compromise value.
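The value 0.52 can be recovered directly, because (4.3) rearranges to $\{p / (1 - p)\}^{4} = c_{B} α (1 - α) / \{c_{A} β (1 - β)\}$. The Python sketch below evaluates this closed form using the population sizes of Section 4.1, taking $α$ and $β$ to be the dual-user shares of the two frames (our reading of the notation). The variable names are illustrative.

```python
# Domain sizes quoted in Section 4.1: LLO, dual-user, CPO.
N_a, N_ab, N_b = 15_162_402, 68_289_578, 31_265_108
alpha = N_ab / (N_a + N_ab)     # dual-user share of the landline frame
beta = N_ab / (N_ab + N_b)      # dual-user share of the cell-phone frame
c_A, c_B = 1.0, 2.0             # interview hours per case

# (4.3) rearranged: (p / (1 - p))**4 = c_B*alpha*(1-alpha) / (c_A*beta*(1-beta)).
odds = (c_B * alpha * (1 - alpha) / (c_A * beta * (1 - beta))) ** 0.25
p_o = odds / (1 + odds)
print(round(p_o, 2))
```

Because (4.3) involves the costs only through the ratio $c_{B} / c_{A}$, which equals 2 in all three cost structures, the same $p_{o}$ applies to each of them.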

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Catalogue no. 12-001-X

Frequency: semi-annual

Ottawa

Date modified: 2017-09-20
