A few remarks on a small example by Jean-Claude Deville regarding non-ignorable non-response

Section 6. Discussion

Deville’s example is especially welcome since, for both models, the three estimation methods provide exactly the same estimators. Obviously, if the model is more complicated, using the maximum likelihood method becomes cumbersome, if not impossible. The calibration and generalized calibration methods work in all cases, provided that enough calibration variables with known totals are available and that the matrix

$\sum_{k \in R} x_{k} z_{k}^{\top}$

is invertible. In this example, the determinant of this matrix appears in the denominator of the estimators, so a small determinant makes the estimates especially risky. Lesage and Haziza (2015) recommend verifying that the correlations between the variables $x_{k}$ and $z_{k}$ are strong enough to avoid amplifying the bias.
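The invertibility condition can be illustrated with a minimal Python sketch. It assumes the simplest setting, in which $x_{k}$ and $z_{k}$ are each a pair of category indicators, so that the matrix reduces to a 2 x 2 table of respondent counts. The three tables below and the helper names (`respondents_from_table`, `cross_moment_matrix`, `det2`) are hypothetical and are not taken from Deville's example.

```python
def respondents_from_table(table):
    """Expand a 2x2 table of respondent counts into (x_k, z_k) indicator pairs.

    Row i is the category of x, column j the category of z (hypothetical data).
    """
    units = [(1, 0), (0, 1)]
    pairs = []
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            pairs.extend([(units[i], units[j])] * count)
    return pairs


def cross_moment_matrix(pairs):
    """Return the matrix sum over respondents of x_k z_k^T as a list of rows."""
    p, q = len(pairs[0][0]), len(pairs[0][1])
    m = [[0.0] * q for _ in range(p)]
    for x, z in pairs:
        for i in range(p):
            for j in range(q):
                m[i][j] += x[i] * z[j]
    return m


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


# Three hypothetical respondent tables with the same margins (40 per row and column)
# but a decreasing association between x and z.
strong = cross_moment_matrix(respondents_from_table([[30, 10], [10, 30]]))
weak = cross_moment_matrix(respondents_from_table([[21, 19], [19, 21]]))
independent = cross_moment_matrix(respondents_from_table([[20, 20], [20, 20]]))

print(det2(strong), det2(weak), det2(independent))  # 800.0 80.0 0.0
```

With identical margins, the determinant shrinks as the association between the two variables weakens and vanishes when they are independent among respondents, which is the situation in which the estimators become unusable.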

If the variables are quantitative, the solutions will depend on the calibration function $F(\cdot)$ used. The calibration function $F(z_{k}^{\top} \lambda) = 1 + \exp(z_{k}^{\top} \lambda)$ is recommended, since it has the advantage of providing weights greater than 1. The inverse of each weight can then be interpreted as a response probability estimated using a logistic model.
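As a small illustration of this interpretation, the Python sketch below evaluates the calibration function and its inverse. The function names and the value of $\lambda$ are assumptions made for the illustration: $\lambda$ was chosen by hand so that the implied probabilities equal 0.2 and 0.8, and it is not the result of solving any calibration equations.

```python
import math


def calibration_factor(z, lam):
    """F(z^T lambda) = 1 + exp(z^T lambda); always strictly greater than 1."""
    u = sum(z_i * lam_i for z_i, lam_i in zip(z, lam))
    return 1.0 + math.exp(u)


def response_probability(z, lam):
    """Inverse of the calibration factor, 1 / (1 + exp(z^T lambda)).

    This is a logistic function of -z^T lambda, hence a value in (0, 1).
    """
    return 1.0 / calibration_factor(z, lam)


# Hypothetical parameter for a z made of two category indicators.
lam = (math.log(4.0), math.log(0.25))

q_first = response_probability((1, 0), lam)   # 1 / (1 + 4)
q_second = response_probability((0, 1), lam)  # 1 / (1 + 0.25)
print(round(q_first, 6), round(q_second, 6))  # 0.2 0.8
```

Because the exponential is positive, the factor can never fall to 1 or below, so the implied response probability always lies strictly between 0 and 1.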

The main difficulty is obviously choosing between the two proposed models. In Deville’s example, it may seem more “logical” for non-response to depend on drug use rather than on gender. However, we are not well equipped to choose between the two models: the values of the two likelihood functions at the estimated parameters are equal. Is it possible to choose the model based on more than a strong conviction? As suggested in Haziza and Lesage (2016), we recommend always calculating both weightings and comparing the weights and estimates obtained with each of them.

One option is to calculate an indicator of the dispersion of the estimated response probabilities, such as their variance. A large variance means that the model yields response probabilities that contrast strongly between individuals, and therefore that it has taken better account of the non-response. Conversely, a small variance means that the non-response model could not highlight differences in response probabilities between individuals. Seeking contrasting weights is also the basis for identifying response homogeneity groups (RHGs) in segmentation methods, such as the chi-square automatic interaction detector (CHAID) algorithm developed by Kass (1980): at each step, CHAID splits the RHGs on the categories that result in response probabilities with the greatest contrast. Applying the same principle to the choice of model, we can select the model that provides the weights with the greatest contrast. Incidentally, the variance of the response probabilities is the quantity on which the R-indicator defined by Schouten, Cobben and Bethlehem (2009) is based (the R-indicator is a decreasing function of their standard deviation); it is used here to choose a non-response model.

In both cases, the average response probability equals 0.5. Specifically,

$\bar{p} = \frac{n_{H.} \hat{p}_{H} + n_{F.} \hat{p}_{F}}{n} = \frac{300 \times 0.4 + 300 \times 0.6}{600} = 0.5$

and

$\bar{q} = \frac{\hat{n}_{.D} \hat{q}_{D} + \hat{n}_{.S} \hat{q}_{S}}{n} = \frac{300 \times 0.2 + 300 \times 0.8}{600} = 0.5.$

For the MAR model, the variance is

$V_{\mathrm{MAR}} = \frac{n_{H.} (\hat{p}_{H} - \bar{p})^{2} + n_{F.} (\hat{p}_{F} - \bar{p})^{2}}{n} = \frac{300 (0.4 - 0.5)^{2} + 300 (0.6 - 0.5)^{2}}{600} = 0.01.$

For the NMAR model, the variance is

$V_{\mathrm{NMAR}} = \frac{\hat{n}_{.D} (\hat{q}_{D} - \bar{q})^{2} + \hat{n}_{.S} (\hat{q}_{S} - \bar{q})^{2}}{n} = \frac{300 (0.2 - 0.5)^{2} + 300 (0.8 - 0.5)^{2}}{600} = 0.09.$

The greater variance of the NMAR model is an argument in its favour. In fact, the response probabilities show much greater contrast.
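The comparison can be reproduced with a few lines of Python. This is only a sketch: the group sizes and estimated response probabilities are those given in the formulas of this section, while the function names are introduced here for illustration.

```python
def weighted_mean(values, sizes):
    """Mean of group-level values, each weighted by its group size."""
    return sum(n * v for v, n in zip(values, sizes)) / sum(sizes)


def weighted_variance(values, sizes):
    """Variance of group-level values around their size-weighted mean."""
    mean = weighted_mean(values, sizes)
    return sum(n * (v - mean) ** 2 for v, n in zip(values, sizes)) / sum(sizes)


sizes = [300, 300]

# MAR model: response probabilities by gender.
p_bar = weighted_mean([0.4, 0.6], sizes)
v_mar = weighted_variance([0.4, 0.6], sizes)

# NMAR model: response probabilities by drug use.
q_bar = weighted_mean([0.2, 0.8], sizes)
v_nmar = weighted_variance([0.2, 0.8], sizes)

print(round(p_bar, 4), round(v_mar, 4))   # 0.5 0.01
print(round(q_bar, 4), round(v_nmar, 4))  # 0.5 0.09

# Selection rule discussed in the text: prefer the larger variance.
preferred = "NMAR" if v_nmar > v_mar else "MAR"
print(preferred)  # NMAR
```

The same two functions apply unchanged to models with more than two groups, since they only require one estimated probability and one size per group.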

Acknowledgements

The author thanks Audrey-Anne Vallée for her meticulous proofreading of an earlier version of this text and an anonymous referee for their especially pertinent comments.

References

Chang, T., and Kott, P.S. (2008). Using calibration weighting to adjust for nonresponse under a plausible model. Biometrika, 95, 555-571.

Deville, J.-C. (2000). Generalized calibration and application to weighting for non-response. In Compstat - Proceedings in Computational Statistics: 14th Symposium held in Utrecht, Netherlands, pages 65-76, New York: Springer.

Deville, J.-C. (2002). La correction de la nonréponse par calage généralisé. In Actes des Journées de Méthodologie Statistique, Paris. Insee-Méthodes.

Deville, J.-C. (2004). Calage, calage généralisé et hypercalage. Technical report, internal document, INSEE, Paris.

Deville, J.-C. (2005). Calibration, past, present and future? Presentation at the conference: Calibration Tools for Survey Statisticians, Neuchâtel.

Deville, J.-C., and Särndal, C.-E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376-382.

Haziza, D., and Lesage, E. (2016). A discussion of weighting procedures for unit nonresponse. Will appear in the Journal of Official Statistics.

Kass, G.V. (1980). An exploratory technique for investigating large quantities of categorical data. Applied Statistics, 119-127.

Kott, P.S. (2006). Using calibration weighting to adjust for nonresponse and coverage errors. Survey Methodology, 32, 2, 133-142. Paper available at http://www.statcan.gc.ca/pub/12-001-x/2006002/article/9547-eng.pdf.

Kott, P.S., and Chang, T. (2010). Using calibration weighting to adjust for nonignorable unit nonresponse. Journal of the American Statistical Association, 105(491), 1265-1275.

Lesage, E., and Haziza, D. (2015). On the problem of bias and variance amplification of the instrumental calibration estimator in the presence of unit nonresponse. Under revision for Journal of Survey Statistics and Methodology.

Schouten, B., Cobben, F. and Bethlehem, J. (2009). Indicators for the representativeness of survey response. Survey Methodology, 35, 1, 101-113. Paper available at http://www.statcan.gc.ca/pub/12-001-x/2009001/article/10887-eng.pdf.

ISSN: 1492-0921

Copyright

Published by authority of the Minister responsible for Statistics Canada.

Use of this publication is governed by the Statistics Canada Open Licence Agreement.

Catalogue No. 12-001-X

Frequency: semi-annual

Ottawa

Date modified: 2016-12-20
