5. Non-optimal properties of existing methods
Sun Woong Kim, Steven G. Heeringa and Peter W. Solenberger
Previous | Next
As described in Section
4, the algorithms for controlled selection may be divided into two parts, manual
algorithms before 1980s and computer-intensive algorithms since then. For large
controlled selection problems with many cells, the latter class of algorithms may
be preferred. But when the problem is small, the former can be easily used
without the complexity of the latter. Therefore we would not say that the
former is always inferior to the latter. More objective criteria for comparing
them would be necessary, and the optimal solution may be adopted as one of the
better criteria to compare their strengths or weaknesses.
As discussed by Jessen
(1978, pages 375-376), the algorithms of Jessen (1970) aim to minimize the
number of arrays in a solution set
,
and the algorithm of Jessen (1978) quite easily achieves that purpose relative to
those of Jessen (1970). Thus his algorithms pursue “simplicity” in formulating
a solution rather than an optimal solution.
The algorithm of Causey
et al. (1985) may give a “partially” optimal solution. Other than the original
problem,
, it
sequentially creates a small number of new controlled selection problems, and
then as a solution it finds only one array
to be nearest to each problem, starting with
. Each
problem is regarded as the transportation problem of Cox and Ernst (1982),
which is formed by the objective function mimicking the behavior of
Note that since function
(5.1) violates the triangle inequality axiom (iii), it is not a distance
function. It needs the inclusion of the
root to be a distance function. Also, each
is
calculated by a simple formula. In view of the optimality requirements given by
R1 and R2 the Causey et al. algorithm
has the following
weaknesses: 1) Since other controlled
selection problems in addition to the original problem
are involved, it is difficult to obtain the
solution consistently based on the closeness between the unique
and every individual
in
;
2) The maximization of the probabilities of selection for the arrays nearest to
is not guaranteed.
Winkler
(2001) presented a modification of the method of Causey et al. (1985). Instead
of using the transportation problem, he proposed integer linear programming,
resulting in slight changes of the
Nevertheless, the Winkler (2001) algorithm is
not free from the weaknesses of the Causey et al. (1985) method.
Adopting a network flow
problem approach, the Huang and Lin (1998) algorithm imposes the additional subgroup
constraints in
,
raised by Goodman and Kish (1950). However, it does not attain objectives R1 and
R2, just as in Causey et al. (1985) and Winkler (2001), since a new network,
instead of a new controlled selection problem, is generated at every iteration,
an arbitrary
is obtained as a solution to the network, and
is calculated by a simple formula.
In contrast, the LP
algorithms proposed by Sitter and Skinner (1994) and Tiwari and Nigam (1998) use
all possible arrays in
.
Note that finding all those arrays is an important issue, and that
for all possible arrays are simultaneously
obtained by running the software for LP only once. The key idea underlying the
algorithm of Sitter and Skinner (1994) is to use a “loss function” defined by
In terms of R1 and R2,
their algorithm has the following disadvantages: 1) The closeness between
and
is not well captured by loss function (5.2). This is
because it is not a distance function that satisfies axiom (iii), as the
marginal totals are used, instead of the cell entries; 2) Loss function (5.2)
is irrelevant to the maximization of the probabilities of selection over the arrays nearest to
in Problems 2.1, 2.2, and 2.4, since it is
always zero.
The LP method of Tiwari
and Nigam (1998) can be used to reduce the selection probabilities of
non-preferred arrays (e.g., arrays not containing the PSU corresponding to the
cell
in Problem 2.1), which are initially
determined by the samplers. For controlled selection problems with integer
margins and without considering the non-preferred arrays, their method will give
the same solutions as that of Sitter and Skinner (1994).
The solutions from these
previous methods will be compared with those from the proposed method in
Section 6, on several examples in Section 8.
Previous | Next