Decomposition of gender wage inequalities through calibration: Application to the Swiss structure of earnings survey
Section 5. The calibration approach
5.1 The calibration method
The calibration method was introduced by Deville and Särndal (1992). The idea behind the technique is to use information known at the population level on some auxiliary variables to estimate a function of a variable of interest. Usually, the auxiliary variables and the variable of interest are correlated. The resulting estimators are consistent and, when this correlation is strong, generally more efficient than estimators based on the sampling weights alone.
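Assuming that the sampling weights $d_k$ are available for the units $k$ of the sample $S$ and that the totals of the auxiliary information at the level of the population $U$, given by $\mathbf{X} = \sum_{k \in U} \mathbf{x}_k$, are known, new weights $w_k$ should be constructed, such that the following constraint (or calibration equation) is respected:
$$\sum_{k \in S} w_k \mathbf{x}_k = \mathbf{X}. \tag{5.1}$$
The weights are determined by solving in $\boldsymbol{\lambda}$ the calibration equations, which become
$$\sum_{k \in S} d_k F(\mathbf{x}_k^\top \boldsymbol{\lambda})\, \mathbf{x}_k = \mathbf{X},$$
where $F(\cdot)$ is the calibration function, so that $w_k = d_k F(\mathbf{x}_k^\top \boldsymbol{\lambda})$. The resulting calibration estimator of the total $Y = \sum_{k \in U} y_k$ is
$$\hat{Y}_{\text{CAL}} = \sum_{k \in S} w_k y_k = \sum_{k \in S} d_k F(\mathbf{x}_k^\top \boldsymbol{\lambda})\, y_k.$$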
In what follows, two calibration functions are used. The first is the linear case, where the pseudo-distance function is the chi-square distance and the calibration function is given by $F(u) = 1 + u$. The second is the raking ratio, which uses the entropy pseudo-distance and where the calibration function is given by $F(u) = \exp(u)$.
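To see where the two calibration functions come from, the following sketch minimizes the pseudo-distance under the calibration constraint. The explicit forms of $G$ written here are the standard ones from Deville and Särndal (1992); they are an assumption, since this section does not spell them out.

$$
\begin{aligned}
&\min_{w_k} \sum_{k \in S} G(w_k, d_k) \quad \text{subject to} \quad \sum_{k \in S} w_k \mathbf{x}_k = \mathbf{X}
\;\Longrightarrow\; \frac{\partial G(w_k, d_k)}{\partial w_k} = \mathbf{x}_k^\top \boldsymbol{\lambda}, \\
&\text{chi-square:}\quad G(w_k, d_k) = \frac{(w_k - d_k)^2}{2 d_k}
\;\Longrightarrow\; \frac{w_k - d_k}{d_k} = \mathbf{x}_k^\top \boldsymbol{\lambda}
\;\Longrightarrow\; w_k = d_k \left(1 + \mathbf{x}_k^\top \boldsymbol{\lambda}\right), \\
&\text{entropy:}\quad G(w_k, d_k) = w_k \log\frac{w_k}{d_k} - w_k + d_k
\;\Longrightarrow\; \log\frac{w_k}{d_k} = \mathbf{x}_k^\top \boldsymbol{\lambda}
\;\Longrightarrow\; w_k = d_k \exp\left(\mathbf{x}_k^\top \boldsymbol{\lambda}\right).
\end{aligned}
$$

In both cases the calibration function equals 1 at zero, so $\boldsymbol{\lambda} = \mathbf{0}$ returns the sampling weights unchanged.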
5.2 Calibration of women’s characteristics on the men’s characteristics
Suppose that for all the units of the sample, there is a given sampling weight $d_k$. In the current context, the auxiliary variables used in the calibration process are selected characteristics measured for every individual. The aim is to ‘divert’ the calibration technique in order to compute a weighting system that adjusts the totals of the women’s auxiliary variables to the totals of the men’s. The variable of interest is the logarithm of the wage.
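In the women sample $S_F$, new weights $w_k$ close to $d_k$ are computed, such that the total pseudo-distance $\sum_{k \in S_F} G(w_k, d_k)$ is minimized, subject to the calibration equation
$$\sum_{k \in S_F} w_k \mathbf{x}_k = \hat{\mathbf{X}}_M \frac{\hat{N}_F}{\hat{N}_M}, \tag{5.3}$$
where $\hat{\mathbf{X}}_M = \sum_{k \in S_M} d_k \mathbf{x}_k$, $\hat{N}_M = \sum_{k \in S_M} d_k$ and $\hat{N}_F = \sum_{k \in S_F} d_k$, with $S_M$ denoting the men sample. The vector $\hat{\mathbf{X}}_M \hat{N}_F / \hat{N}_M$ stores the totals of men’s characteristics, rescaled by the ratio of the total of the weights of the women to the total of the weights of the men.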
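Dividing the calibration equation (5.3) by $\hat{N}_F = \sum_{k \in S_F} d_k$, the total of the women’s sampling weights, yields
$$\frac{1}{\hat{N}_F} \sum_{k \in S_F} w_k \mathbf{x}_k = \frac{\hat{\mathbf{X}}_M}{\hat{N}_M}, \tag{5.4}$$
where $\hat{\mathbf{X}}_M$ and $\hat{N}_M$ are the men’s estimated totals of characteristics and of weights. So with the new weights $w_k$ the new women’s means of characteristics are equal to those of men. Another interesting equality is
$$\sum_{k \in S_F} w_k = \sum_{k \in S_F} d_k = \hat{N}_F, \tag{5.5}$$
which holds because the vector $\mathbf{x}_k$ contains a constant term (the intercept) and calibration is performed on it. If $\hat{\bar{\mathbf{X}}}_{F|M}$ denotes the women’s mean of characteristics computed with the weights $w_k$, by putting together equations (5.4) and (5.5), this means that
$$\hat{\bar{\mathbf{X}}}_{F|M} = \frac{\sum_{k \in S_F} w_k \mathbf{x}_k}{\sum_{k \in S_F} w_k} = \frac{\hat{\mathbf{X}}_M}{\hat{N}_M} = \hat{\bar{\mathbf{X}}}_M.$$
Women’s counterfactual wage mean estimator is thus
$$\hat{\bar{Y}}_{F|M} = \frac{\sum_{k \in S_F} w_k y_k}{\sum_{k \in S_F} w_k}.$$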
5.3 Linear calibration
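Result 2 Women’s counterfactual wage mean obtained using linear calibration is equal to the counterfactual wage mean obtained using the weighted BO method, i.e.,
$$\hat{\bar{Y}}^{\text{LIN}}_{F|M} = \hat{\bar{\mathbf{X}}}_M^\top \hat{\boldsymbol{\beta}}_F,$$
where $\hat{\bar{\mathbf{X}}}_M$ is the men’s weighted mean of characteristics and $\hat{\boldsymbol{\beta}}_F$ is the vector of weighted regression coefficients estimated on the women sample.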
Proof
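In order to determine the vector $\boldsymbol{\lambda}$ in the case when the chi-squared pseudo-distance is used, the following equation must be solved:
$$\sum_{k \in S_F} d_k \left(1 + \mathbf{x}_k^\top \boldsymbol{\lambda}\right) \mathbf{x}_k = \hat{\mathbf{X}}_M \frac{\hat{N}_F}{\hat{N}_M}.$$
Thus,
$$\hat{\mathbf{X}}_F + \mathbf{T} \boldsymbol{\lambda} = \hat{\mathbf{X}}_M \frac{\hat{N}_F}{\hat{N}_M},$$
where $\hat{\mathbf{X}}_F = \sum_{k \in S_F} d_k \mathbf{x}_k$ and $\mathbf{T} = \sum_{k \in S_F} d_k \mathbf{x}_k \mathbf{x}_k^\top$. Thus
$$\boldsymbol{\lambda} = \mathbf{T}^{-1} \left( \hat{\mathbf{X}}_M \frac{\hat{N}_F}{\hat{N}_M} - \hat{\mathbf{X}}_F \right).$$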
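Using the result from the previous equation, the numerator of expression (5.2) becomes
$$\hat{Y}^{\text{LIN}}_{F|M} = \sum_{k \in S_F} d_k \left(1 + \mathbf{x}_k^\top \boldsymbol{\lambda}\right) y_k = \hat{Y}_F + \left( \hat{\mathbf{X}}_M \frac{\hat{N}_F}{\hat{N}_M} - \hat{\mathbf{X}}_F \right)^\top \mathbf{T}^{-1} \sum_{k \in S_F} d_k \mathbf{x}_k y_k, \tag{5.7}$$
where $\hat{Y}^{\text{LIN}}_{F|M}$ denotes the total of the logarithm of the wage in the women sample, when the total is constructed using the chi-squared pseudo-distance, $\hat{Y}_F = \sum_{k \in S_F} d_k y_k$, $\hat{\mathbf{X}}_F = \sum_{k \in S_F} d_k \mathbf{x}_k$ and $\mathbf{T} = \sum_{k \in S_F} d_k \mathbf{x}_k \mathbf{x}_k^\top$. Let
$$\hat{\boldsymbol{\beta}}_F = \mathbf{T}^{-1} \sum_{k \in S_F} d_k \mathbf{x}_k y_k.$$
Vector $\hat{\boldsymbol{\beta}}_F$ has already been defined in the same way in equation (3.1) for the weighted BO method. Equation (5.7) is rewritten as
$$\hat{Y}^{\text{LIN}}_{F|M} = \hat{Y}_F + \left( \hat{\mathbf{X}}_M \frac{\hat{N}_F}{\hat{N}_M} - \hat{\mathbf{X}}_F \right)^\top \hat{\boldsymbol{\beta}}_F = \hat{\mathbf{X}}_M^\top \hat{\boldsymbol{\beta}}_F \frac{\hat{N}_F}{\hat{N}_M}, \tag{5.8}$$
because under the condition of Result 1, $\hat{Y}_F = \hat{\mathbf{X}}_F^\top \hat{\boldsymbol{\beta}}_F$. By dividing (5.8) by $\hat{N}_F$, Result 2 is obtained.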
Using the chi-squared pseudo-distance, the resulting weights are unbounded, so the calibration weights may be negative. Even though this calibration instance yields the same results as the BO method for average wages, we advocate the use of an instance that gives nonnegative weights.
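As a numerical illustration of Result 2, the following minimal sketch computes the linear calibration weights in closed form on simulated data and compares the resulting counterfactual mean with the weighted BO counterfactual. The data, the single characteristic and all variable names are hypothetical; only the formulas follow the proof above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: an intercept plus one characteristic; y is the
# log wage of women. All values are simulated for illustration only.
n_f, n_m = 200, 250
x_f = np.column_stack([np.ones(n_f), rng.normal(10.0, 3.0, n_f)])
x_m = np.column_stack([np.ones(n_m), rng.normal(13.0, 3.0, n_m)])
y_f = 2.0 + 0.05 * x_f[:, 1] + rng.normal(0.0, 0.2, n_f)
d_f = rng.uniform(5.0, 15.0, n_f)  # women's sampling weights
d_m = rng.uniform(5.0, 15.0, n_m)  # men's sampling weights

# Calibration target: men's totals rescaled to the women's weight total.
target = (d_m @ x_m) * d_f.sum() / d_m.sum()

# Linear calibration in closed form: lam = T^{-1} (target - X_F_hat).
T = (x_f * d_f[:, None]).T @ x_f
lam = np.linalg.solve(T, target - d_f @ x_f)
w = d_f * (1.0 + x_f @ lam)

# Weighted least-squares coefficients of the women's wage equation.
beta_f = np.linalg.solve(T, (x_f * d_f[:, None]).T @ y_f)

xbar_m = (d_m @ x_m) / d_m.sum()   # men's weighted mean characteristics
mean_cal = (w @ y_f) / w.sum()     # counterfactual mean via calibration
mean_bo = xbar_m @ beta_f          # counterfactual mean via weighted BO

print("counterfactual means agree:", np.isclose(mean_cal, mean_bo))
print("smallest calibrated weight:", w.min())
```

Because nothing constrains the factor $1 + \mathbf{x}_k^\top \boldsymbol{\lambda}$ to be positive, the smallest calibrated weight printed by this sketch can fall below zero when the women's and men's characteristics differ markedly.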
5.4 Raking-ratio calibration
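The second instance of calibration uses the entropy pseudo-distance. It is also known as “raking-ratio” calibration. Using the entropy pseudo-distance, equation (5.3) becomes
$$\sum_{k \in S_F} d_k \exp\left(\mathbf{x}_k^\top \boldsymbol{\lambda}\right) \mathbf{x}_k = \hat{\mathbf{X}}_M \frac{\hat{N}_F}{\hat{N}_M}. \tag{5.9}$$
The resulting system of equations cannot be solved analytically. However, the value of $\boldsymbol{\lambda}$ can be found through the Newton-Raphson algorithm.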
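The Newton-Raphson iteration for the raking-ratio equations can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `raking_weights`, the stopping rule and the toy data are all hypothetical, and the target vector is the men's totals rescaled to the women's weight total, as described in Section 5.2.

```python
import numpy as np

def raking_weights(d, x, target, tol=1e-10, max_iter=100):
    """Solve sum_k d_k exp(x_k' lam) x_k = target by Newton-Raphson.

    d: (n,) sampling weights, x: (n, p) characteristics (first column = 1),
    target: (p,) calibration totals. Returns (weights, lam).
    """
    lam = np.zeros(x.shape[1])
    for _ in range(max_iter):
        w = d * np.exp(x @ lam)
        gap = target - w @ x
        if np.all(np.abs(gap) <= tol * (1.0 + np.abs(target))):
            return w, lam
        # Jacobian of the left-hand side: sum_k w_k x_k x_k'
        jac = (x * w[:, None]).T @ x
        lam = lam + np.linalg.solve(jac, gap)
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical toy data: intercept plus one characteristic.
x_f = np.column_stack([np.ones(8), np.arange(1.0, 9.0)])   # women
x_m = np.column_stack([np.ones(8), np.arange(2.0, 10.0)])  # men
d_f = np.full(8, 10.0)
d_m = np.full(8, 12.0)
y_f = 2.0 + 0.05 * x_f[:, 1]  # women's log wages (simulated)

target = (d_m @ x_m) * d_f.sum() / d_m.sum()
w, lam = raking_weights(d_f, x_f, target)
mean_rr = (w @ y_f) / w.sum()  # raking-ratio counterfactual wage mean
print(w.min() > 0, np.allclose(w @ x_f, target))
```

Since each weight is the sampling weight multiplied by an exponential, the calibrated weights stay strictly positive whatever value the iteration reaches.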
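Equation (5.2) can now be written as
$$\hat{Y}^{\text{RR}}_{F|M} = \sum_{k \in S_F} d_k \exp\left(\mathbf{x}_k^\top \boldsymbol{\lambda}\right) y_k,$$
where $\hat{Y}^{\text{RR}}_{F|M}$ denotes the total of the logarithm of the wage in the women sample, when the total is constructed using the raking-ratio calibration. The counterfactual wage mean of women is written as
$$\hat{\bar{Y}}^{\text{RR}}_{F|M} = \frac{\sum_{k \in S_F} d_k \exp\left(\mathbf{x}_k^\top \boldsymbol{\lambda}\right) y_k}{\sum_{k \in S_F} d_k \exp\left(\mathbf{x}_k^\top \boldsymbol{\lambda}\right)}.$$
The equation above is very similar to equation (4.4). The only difference lies in the estimation of the parameters. The vector $\boldsymbol{\lambda}$ contains the Lagrangian multipliers solving equation (5.9) under constraint (5.1), while the corresponding parameter vector in equation (4.4) is found through maximum likelihood.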
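After computing the calibration weights $w_k$ defined in (5.3) and by using the information in equation (5.6), it results that
$$\frac{\sum_{k \in S_F} w_k \mathbf{x}_k}{\sum_{k \in S_F} w_k} - \frac{\sum_{k \in S_M} d_k \mathbf{x}_k}{\sum_{k \in S_M} d_k} = \mathbf{0},$$
that is, the women’s reweighted means of characteristics coincide with the men’s means, which ensures that the residual part of the structure effect defined in equation (4.8) will equal 0. This is a solution to the problem shown in Section 4.3. This instance of calibration also remedies the issue of the negative weights that may arise when using the chi-squared pseudo-distance.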