Domain sample allocation within primary sampling units in designing domain-level equal probability selection methods
7. Summary and concluding remarks
In the design of any survey, there is a need for good representation of
analysis domains in the sample. Allocating the sample is not simple because,
unlike indicators for commonly used strata, which are available in the
sampling frame, domain indicators are generally not available; even when they
are, it may not be practical to stratify by domains because of interviewer
travel costs for in-person surveys. What is needed is a method of sample
allocation that allows for the desired over- or under-sampling of domains such
that the resulting design is self-weighting, or epsem (equal probability
selection method), for domains. Such designs are
desirable for variance efficiency in general. In the case of one phase two
stage designs, under certain assumptions, it is possible to allocate equal
interviewer workload per selected PSU such that the sample size for all
selected PSUs is controlled at the desired level, but domain sizes over all
selected PSUs satisfy the desired level only in expectation. On the other hand,
in the case of two phase two stage designs, it is possible to allocate domain
sample sizes within PSUs such that the domain sizes over all PSUs are
controlled at desired levels but the sample size per selected PSU is not
controlled as the equal interviewer workload per PSU is not deemed important in
this case. Although the epsem design of Kish at the population level is well
known, domain-level epsem designs (denoted depsem in this paper) are not well
known among practitioners.
In this paper, we considered two main scenarios for the domain-level epsem
designs studied by Folsom et al. (1987). First, for two stage designs with
known domain-level PSU population counts (as well as known frame-level domain
identifiers for elementary units) and pre-specified domain sample sizes, the
PSU selection probabilities are defined such that the desired PSU sample size
(equal per PSU) is allocated to domains within PSUs to obtain a domain-level
epsem design. Second, for two phase two stage designs with known PSU selection
probabilities and pre-specified domain sample sizes, the domain sampling rates
are defined such that the desired domain sample size is allocated to PSUs
within domains to obtain a domain-level epsem design. These two designs were
referred to as depsem-A and depsem-B, respectively. A simple
justification of these two designs was provided. It is based on the key idea
for obtaining domain-level epsem designs: the sampling rate at the PSU by
domain level should be made directly proportional to the domain-level sampling
rate but inversely proportional to the PSU selection probability. For the
first design, the domain-level sampling rate is known but the PSU selection
probability is suitably defined (via what Folsom et al. (1987) termed a
composite measure of size), while for the second design, the PSU selection
probability is known but the domain-level sampling rate is suitably defined.
The corresponding stratified versions can also be easily defined.
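The key idea above can be checked numerically. The following is a minimal
sketch, not the authors' implementation. The function names, the symbols
(N for domain-by-PSU counts, f for domain sampling rates, pi for PSU selection
probabilities) and the example counts are all illustrative assumptions. For
the one phase case, the composite size measure is assumed to be the sum over
domains of f_d times N_di; for the two phase case, the domain sampling rate is
assumed to be the desired domain sample size divided by the weighted domain
count from the screened PSUs. Rounding to integer sample sizes and feasibility
checks (probabilities and rates at most 1) are ignored.

```python
import math


def composite_size_allocation(N, n_dom, m):
    """One phase, two stage sketch (illustrative names, assumed formulas).

    N[i][d]  : population count of domain d in frame PSU i (assumed known)
    n_dom[d] : pre-specified sample size for domain d
    m        : number of PSUs to select
    """
    D = len(n_dom)
    N_dom = [sum(row[d] for row in N) for d in range(D)]
    f = [n_dom[d] / N_dom[d] for d in range(D)]  # domain sampling rates (known)
    # Assumed composite size measure per PSU: sum over d of f_d * N_di.
    S = [sum(f[d] * row[d] for d in range(D)) for row in N]
    pi = [m * s / sum(S) for s in S]             # PSU selection probabilities
    n_star = sum(n_dom) / m                      # equal sample size per PSU
    # Within-PSU domain allocation; each row sums to n_star.
    n_alloc = [[n_star * f[d] * row[d] / s for d in range(D)]
               for row, s in zip(N, S)]
    return f, pi, n_alloc


def two_phase_allocation(N_sel, pi_sel, n_dom):
    """Two phase, two stage sketch (illustrative names, assumed formulas).

    N_sel[i][d] : domain counts obtained by screening selected PSU i
    pi_sel[i]   : known selection probability of selected PSU i
    n_dom[d]    : pre-specified sample size for domain d
    """
    D = len(n_dom)
    # Weighted domain counts over the selected PSUs.
    N_hat = [sum(row[d] / p for row, p in zip(N_sel, pi_sel)) for d in range(D)]
    f = [n_dom[d] / N_hat[d] for d in range(D)]  # domain sampling rates (defined)
    # Within-PSU rate is f_d / pi_i, so the allocation is f_d * N_di / pi_i.
    n_alloc = [[f[d] * row[d] / p for d in range(D)]
               for row, p in zip(N_sel, pi_sel)]
    return f, n_alloc


# Hypothetical frame: 4 PSUs, 2 domains.
N = [[100, 50], [200, 50], [300, 100], [400, 100]]
f_a, pi_a, alloc_a = composite_size_allocation(N, n_dom=[50, 30], m=2)

# Hypothetical first phase: 2 selected PSUs with known selection probabilities.
N_sel = [[100, 50], [300, 100]]
pi_sel = [0.2, 0.5]
f_b, alloc_b = two_phase_allocation(N_sel, pi_sel, n_dom=[50, 20])

# Overall inclusion probability = pi_i * (n_di / N_di); it should equal f_d.
overall_a = [[pi_a[i] * alloc_a[i][d] / N[i][d] for d in range(2)]
             for i in range(len(N))]
print(overall_a)
print([sum(row) for row in alloc_a])                     # per-PSU sample sizes
print([sum(row[d] for row in alloc_b) for d in range(2)])  # domain totals
```

In this sketch the first function fixes the sample size per PSU while the
second fixes the domain totals over the selected PSUs, which mirrors the
contrast drawn between the one phase and two phase designs.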
As a generalization of the first design, a design was proposed in which
domain-level PSU population counts are only approximately known for specifying
PSU selection probabilities, but a two phase design is used to allocate the
desired domain sample sizes to PSUs after obtaining the true domain-level
population counts for the selected PSUs in the first phase. Generalizations
were also considered for the case where PSU selection probabilities are
pre-specified from other considerations. A further extension covers certain
realistic practical situations: 1) subsampling of elementary units within each
selected PSU in the first phase to reduce cost, and 2) nonresponse among
screening units for domain classification. A final extension allows for
stratification in addition to the practical features mentioned above. For all
of these designs except the one phase two stage design, PSU sample size is not
directly controlled, but domain sample size is controlled via stratification
of the first phase sample before the second phase. This is not a limitation in
the various practical applications where interviews are not conducted
face-to-face.
The initial design framework of Folsom et al. (1987) for allocating equal
probability samples to multiple domains in two-stage designs, in conjunction
with one or two phases, is a useful technique currently available in
the SUDAAN software system (http://www.rti.org/page.cfm/SUDAAN) and has been
employed successfully at RTI International for many years for studies such as
the National Survey of Child and Adolescent Well-Being. The generalizations
presented here extend the technique to the situation of multiple domains where
the domain-level population counts need to be estimated for all selected PSUs,
and where PSU selection probabilities are pre-specified from other
considerations. These techniques are expected to be useful to sampling
statisticians in a variety of situations.
Acknowledgements
The first author
is grateful to Ralph Folsom of RTI International for several useful
discussions. The authors thank Ned English of NORC at the University of Chicago
for his help in creating site areas for the toolbox application. Thanks are
also due to the associate editor and two referees for their very useful
comments which helped improve clarity and presentation of the paper.
References
Cochran, W. (1977). Sampling Techniques, 3rd Ed. New York: John
Wiley & Sons, Inc.
Eltinge, J. (2011). Personal Communication.
Fahimi, M., and Judkins, D. (1991). PSU probabilities
given differential sampling at second stage. Proceedings of the Section on Survey Research Methods, American
Statistical Association, 538-543.
Folsom, R.E., Potter, F.J. and Williams, S.R. (1987).
Notes on a composite size measure for self-weighting samples in multiple
domains. Proceedings of the Section on
Survey Research Methods, American Statistical Association, 792-796.
Harter, R., Eckman, S., English, N. and
O’Muircheartaigh, C. (2010). Applied sampling for large-scale multi-stage area probability
designs. In Handbook of Survey Research,
Second Edition (Eds., P. Marsden and J. Wright). United Kingdom: Emerald.
Kish, L. (1965). Survey Sampling. New York: John
Wiley & Sons, Inc.
Lohr, S. (2010). Sampling: Design and Analysis, 2nd Ed. Boston: Brooks/Cole.
Singh, A.C., and Harter, R.M. (2011). A generalized
epsem two-phase design for domain estimation. Proceedings of the Section on Survey Research Methods, American
Statistical Association, 3269-3282.
Copyright
Published by authority of the Minister responsible for Statistics Canada.