Statistics Canada

Workshops

Workshop 1: Statistical Disclosure Control: A Risk-Utility Framework
Workshop 2: Record Linkage Methods: Theory and Application with G-Link
Workshop 3: Developments in Small Area Estimation: Methods, Applications and Software Development

Three full-day workshops will take place on Tuesday, November 1.

Workshop 1: Statistical Disclosure Control: A Risk-Utility Framework

Natalie Shlomo, University of Southampton

(English presentation with simultaneous translation and with French and English documents)

Aim

To provide an understanding of the motivation for, and the issues involved in, protecting confidentiality in statistical outputs, including disclosure control methods for different types of outputs, the quantification of disclosure risk and data utility, and the Disclosure Risk – Data Utility framework.

Lecture Programme

Introduction and motivation of statistical disclosure control for statistical outputs.
Measuring and quantifying disclosure risk for tabular data and microdata based on realistic scenarios of risk.
Disclosure control methods for tabular data and microdata.
Measuring and quantifying the impact and the effect of disclosure control methods on the quality of the data.
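
As a rough illustration of the risk-utility trade-off for microdata, the sketch below counts records whose combination of key variables occurs fewer than k times, before and after a simple recoding. The records, the choice of key variables, the threshold k and the 10-year age bands are all hypothetical assumptions made for this example; they are not taken from the workshop material.

```python
from collections import Counter

def risky_records(records, keys, k=2):
    """Return records whose key-variable combination occurs fewer than k times."""
    combos = Counter(tuple(r[key] for key in keys) for r in records)
    return [r for r in records if combos[tuple(r[key] for key in keys)] < k]

# Hypothetical microdata: every record is unique on (age, sex, region).
records = [
    {"age": 34, "sex": "F", "region": "A"},
    {"age": 35, "sex": "F", "region": "A"},
    {"age": 36, "sex": "M", "region": "B"},
    {"age": 71, "sex": "M", "region": "B"},
]
keys = ["age", "sex", "region"]

# Recoding age into 10-year bands lowers risk but discards detail (utility).
banded = [{**r, "age": r["age"] // 10 * 10} for r in records]

risk_before = len(risky_records(records, keys))
risk_after = len(risky_records(banded, keys))
print(risk_before, risk_after)
```

The recoding reduces the number of unique records at the cost of exact ages, which is the kind of trade-off a risk-utility assessment tries to quantify.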

Reading list: see "Reading list for workshop 1" below.

Workshop 2: Record Linkage Methods: Theory and Application with G-Link

Eric Hortop and Abdelnasser Saidi, Statistics Canada

(French presentation with simultaneous translation and with French and English documents)

Record linkage is used increasingly often in many fields, including the electronic maintenance of information registries, health and epidemiology, and the elimination of duplicates when creating survey frames.

Record linkage consists of matching records that contain information on persons, businesses, or households when no unique identifier is available. Within this framework, the records are linked according to the data fields they have in common. This workshop focuses in detail on probabilistic record linkage based on the Fellegi-Sunter theoretical model, in which pairs of records are classified as linked, unlinked, or possibly linked; pairs in the last category have to be resolved manually. This type of record linkage is used frequently at Statistics Canada. To facilitate its use, Statistics Canada has implemented its own probabilistic record linkage program, called “G-Link”, which is based on the Fellegi-Sunter algorithm. This workshop will provide a complete review of the methodology, together with concrete examples developed using G-Link. Finally, there will be a discussion of recent research and innovations in this area.
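
The three-way classification in the Fellegi-Sunter model can be sketched as follows. This is a minimal illustration, not G-Link's implementation: the field names, the m- and u-probabilities, and the two thresholds are hypothetical values chosen for the example, and fields are assumed to be conditionally independent.

```python
import math

# Hypothetical probabilities for three comparison fields.
# m = P(field agrees | the pair is a true match)
# u = P(field agrees | the pair is not a match)
FIELDS = {
    "surname":    {"m": 0.95, "u": 0.01},
    "birth_year": {"m": 0.90, "u": 0.05},
    "postal":     {"m": 0.80, "u": 0.10},
}

def match_weight(agreements):
    """Sum the log2 likelihood ratios over fields for one record pair."""
    total = 0.0
    for name, p in FIELDS.items():
        if agreements[name]:
            total += math.log2(p["m"] / p["u"])
        else:
            total += math.log2((1 - p["m"]) / (1 - p["u"]))
    return total

def classify(weight, lower=0.0, upper=8.0):
    """Classify a pair using two (hypothetical) thresholds."""
    if weight >= upper:
        return "linked"
    if weight <= lower:
        return "unlinked"
    return "possibly linked"  # left for manual resolution

all_agree = {"surname": True, "birth_year": True, "postal": True}
surname_only = {"surname": True, "birth_year": False, "postal": False}
none_agree = {"surname": False, "birth_year": False, "postal": False}

for pair in (all_agree, surname_only, none_agree):
    print(classify(match_weight(pair)))
```

In practice the thresholds are chosen to control the rates of false links and missed links, and the width of the gap between them determines how many pairs go to manual review.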

Workshop 3: Developments in Small Area Estimation: Methods, Applications and Software Development

Mike Hidiroglou, Statistics Canada; J.N.K. Rao, Carleton University; and Victor Estevao, Statistics Canada

(English presentation with simultaneous translation and with French and English documents)

In recent years, demand for reliable small area estimates has greatly increased worldwide, due in part to their growing use in formulating policies and programs, allocating government funds, regional planning, small area business decisions and similar applications. Traditional area-specific direct estimators may not provide acceptable precision for small areas because sample sizes in small areas are seldom large enough. This makes it necessary to "borrow strength" across related areas through indirect estimators based on implicit or explicit linking models, using auxiliary information such as recent census data and current administrative data. Methods based on explicit linking models are now widely accepted. The need to provide reliable small area statistics has led to considerable methodological developments in the past 15 years or so on model-based indirect estimation that "borrows strength" from related areas and thus increases the "effective" sample sizes in the small areas.

In this workshop, we will concentrate on recent developments in small area estimation that are based on area-level and unit-level models. We will illustrate these results with examples that have been published in the literature or that are based on Canadian data available at Statistics Canada. We will also demonstrate how these results can be obtained using software for small area estimation developed at Statistics Canada.
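
To make "borrowing strength" concrete for the area-level case, the sketch below computes a shrinkage estimate for each area under a deliberately simplified Fay-Herriot-type model with an intercept only and no covariates. It is not the Statistics Canada software: the function name, the inputs and the numbers are hypothetical, and the model variance sigma_v2 is treated as known, whereas an EBLUP would estimate it from the data.

```python
def area_level_blup(direct, psi, sigma_v2):
    """Shrink direct estimates toward a common synthetic estimate.

    direct   : direct survey estimates, one per area
    psi      : sampling variances of the direct estimates
    sigma_v2 : model (between-area) variance, assumed known here
    """
    # Weighted least squares estimate of the common mean (intercept-only model).
    weights = [1.0 / (sigma_v2 + p) for p in psi]
    beta = sum(w * y for w, y in zip(weights, direct)) / sum(weights)

    estimates = []
    for y, p in zip(direct, psi):
        # gamma near 1: trust the direct estimate; near 0: trust the model.
        gamma = sigma_v2 / (sigma_v2 + p)
        estimates.append(gamma * y + (1 - gamma) * beta)
    return beta, estimates

# Hypothetical inputs: the second area has a much noisier direct estimate.
beta, est = area_level_blup(direct=[10.0, 20.0], psi=[1.0, 9.0], sigma_v2=1.0)
print(beta, est)
```

The area with the larger sampling variance is pulled further toward the synthetic estimate, which is how the model raises the "effective" sample size of a poorly sampled area.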

References

Rao, J. N. K. (2003). Small Area Estimation. Wiley.

Methodology Software Library Small-Area Estimation Fay-Herriot Area Level Model with EBLUP Estimation Methodology Specifications. February 2011.

Methodology Software Library Small-Area Estimation Unit Level Model with EBLUP and Pseudo EBLUP Estimation Methodology Specifications. March 2011.

Reading list for workshop 1

It will be assumed that you are familiar with the nature of survey data collection and the basic issues associated with confidentiality when collecting data. An elementary introduction to these issues is available on the American Statistical Association website in a brochure on “Surveys and Privacy”.

No prior knowledge of statistical disclosure control will be assumed, but if you wish to familiarize yourself with some of the ideas beforehand, there is much material on the Privacy, Confidentiality and Data Security pages of the American Statistical Association website.

A good introduction to statistical disclosure control is "Statistical Policy Working Paper 22 - Report on Statistical Disclosure Limitation Methodology" (2nd version, 2005), which may be downloaded. In particular, Chapter 2, “Statistical Disclosure Limitation: A Primer”, might be useful pre-course reading.

Further Reading

Willenborg, L. and de Waal, T. (2001) Elements of Statistical Disclosure Control. New York: Springer. 261 pages.

Doyle, P., Lane, J., Theeuwes, J. and Zayatz, L. eds. (2001) Confidentiality, Disclosure and Data Access. Amsterdam: North Holland.

Nin, J. and Herranz, J., eds. (2010) Privacy and Anonymity in Information Management Systems. London: Springer.

Hundepool, A., Domingo-Ferrer, J., Franconi, L., Giessing, S., Lenz, R., Longhurst, J., Schulte-Nordholt, E., Seri, G. and De Wolf, P.P. (2009) Handbook on Statistical Disclosure Control, Version 1.1. Prepared for the European ESSNET SDC Project.

The following websites contain many of the latest developments in Statistical Disclosure Control:

United Nations Economic Commission for Europe (UNECE) websites:

http://www.unece.org/stats/documents/2009.12.confidentiality.htm
http://www.unece.org/stats/documents/2007.12.confidentiality.htm
http://www.unece.org/stats/documents/2005.11.confidentiality.htm

SDC projects developed through a collaboration of European NSIs, universities and Eurostat:

  • CASC project website
  • CENEX project website (reached through the CASC project website)
  • ESSNET project website