Disclosure control and data dissemination

Results

All (10 results)

  • Articles and reports: 11-522-X202200100007
    Description: With the availability of larger and more diverse data sources, statistical institutes in Europe are inclined to publish statistics on smaller groups than they used to. Moreover, high-impact global events such as the Covid crisis and the situation in Ukraine may also call for statistics on specific subgroups of the population. Publishing on small, targeted groups not only raises questions about the statistical quality of the figures, it also raises issues concerning statistical disclosure risk. The principle of statistical disclosure control does not depend on the size of the groups the statistics are based on. However, the risk of disclosure does depend on the group size: the smaller a group, the higher the risk. Traditional ways to deal with statistical disclosure control and small group sizes include suppressing information and coarsening categories. These methods essentially increase the (mean) group sizes. More recent approaches include perturbative methods that aim to keep the group sizes small in order to preserve as much information as possible while reducing the disclosure risk sufficiently. In this paper we mention some European examples of special focus group statistics and discuss the implications for statistical disclosure control. Additionally, we discuss some issues that the use of perturbative methods brings along: their impact on disclosure risk and utility, as well as the challenges in communicating them properly.
    Release date: 2024-03-25

  • Articles and reports: 12-001-X202300100007
    Description: I provide an overview of the evolution of Statistical Disclosure Control (SDC) research over the last decades and how it has adapted to handle the data revolution with more formal definitions of privacy. I emphasize the many contributions by Chris Skinner in the research areas of SDC. I review his seminal research, starting in the 1990s with his work on the release of UK Census sample microdata. This led to a wide range of research on measuring the risk of re-identification in survey microdata through probabilistic models. I also focus on other aspects of Chris’ research in SDC. Chris was the recipient of the 2019 Waksberg Award and sadly never got a chance to present his Waksberg Lecture at the Statistics Canada International Methodology Symposium. This paper follows the outline that Chris had prepared for that lecture.
    Release date: 2023-06-30

  • Articles and reports: 11-522-X200600110434
    Description:

    Protecting respondents from disclosure of their identity in publicly released survey data is of practical concern to many government agencies. Methods for doing so include suppressing cluster and stratum identifiers and altering or swapping record values between respondents. Unfortunately, stratum and cluster identifiers are usually needed for variance estimation using linearization, and also for replication methods, because resampling is typically done on first-stage sampling units within strata. One might think that releasing a set of replicate weights, with stratum and cluster identifiers suppressed, would circumvent this problem to some extent, especially with a random resampling method such as the bootstrap. In this article, we first demonstrate that, by viewing the replicate weights as observations in a high-dimensional space, one can easily use clustering algorithms to reconstruct the cluster identifiers irrespective of the resampling method, even if the resampling weights are randomly altered. We then propose a fast algorithm for swapping cluster and stratum identifiers of ultimate units before creating replicate weights, without significantly affecting the resulting variance estimates of characteristics of interest. The methods are illustrated by application to publicly released data from the National Health and Nutrition Examination Surveys, where such disclosure issues are extremely important.

    Release date: 2008-03-17

  • Articles and reports: 11-522-X20050019460
    Description:

    Users analyse and interpret time series of estimates in various ways, often involving estimates for several time periods. Despite the large sample sizes and the degree of overlap between the samples for some periods, sampling errors can still substantially affect the estimates of movements, and the functions of them, that are used to interpret the series. We consider how to account for sampling errors in the interpretation of estimates from repeated surveys and how to inform users and analysts of their possible impact.

    Release date: 2007-03-02

  • Articles and reports: 11-522-X20050019463
    Description:

    Statisticians are developing additional concepts for communicating errors associated with estimates. Many of these concepts are readily understood by statisticians but are even more difficult to explain to users than the traditional confidence interval. The proposed solution, when communicating with non-statisticians, is to improve the estimates so that the requirement for explaining the error is minimised. The user is then not confused by having too many numbers to understand.

    Release date: 2007-03-02

  • Articles and reports: 11-522-X20050019483
    Description:

    All member countries in Europe face similar problems with respect to statistical disclosure control (SDC). They all need to find a balance between preserving the privacy of respondents and the very legitimate requests of society, researchers and policy makers for more and more detailed information. This growing demand, driven by developments of the information age and knowledge society, is a common problem of the European Statistical System (ESS). The paper discusses current Eurostat confidentiality issues and strategy, and describes a European SDC approach through the establishment of Centres and Networks of Excellence (CENEX).

    Release date: 2007-03-02

  • Articles and reports: 11-522-X20030017691
    Description:

    This paper explains how results of European research projects on statistical disclosure control (SDC) can be used in the production of official statistics. It describes two related software packages for producing safe data: tau-ARGUS for tabular data, and mu-ARGUS for microdata.

    Release date: 2005-01-26

  • Articles and reports: 11-522-X20030017692
    Description:

    This paper discusses regression servers, which are data dissemination systems that return some of the output generated by regression analyses in response to user queries. It details work on the special case where the data contain a sensitive variable whose regressions must be protected.

    Release date: 2005-01-26

  • Articles and reports: 12-001-X20030026784
    Description:

    Skinner and Elliot (2002) proposed a simple measure of disclosure risk for survey microdata and showed how to estimate this measure under sampling with equal probabilities. In this paper we show how their results on point estimation and variance estimation may be extended to handle unequal probability sampling. Our approach assumes a Poisson sampling design. Comments are made about the possible impact of departures from this assumption.

    Release date: 2004-01-27

  • Articles and reports: 11-522-X20010016286
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    It is customary for statistical agencies to audit tables containing suppressed cells in order to ensure that there is sufficient protection against inadvertent disclosure of sensitive information. If the table contains rounded values, the audit procedure may ignore this fact. This oversight can result in over-protection, reducing the utility of the published data. This paper provides the correct auditing formulation and gives examples of over-protection.

    Release date: 2002-09-12
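
The over-protection effect described in the last entry (11-522-X20010016286) can be illustrated with a minimal sketch. It is not the paper's formulation. It assumes a single table row of non-negative counts, a rounding scheme in which every published number lies within `half_width` of its true value, and a hypothetical helper named `feasibility_interval` that returns the range an attacker could infer for one suppressed cell.

```python
from typing import Sequence, Tuple


def feasibility_interval(
    total: int,
    published: Sequence[int],
    n_suppressed: int,
    half_width: int = 0,
) -> Tuple[int, int]:
    """Range an attacker can infer for one suppressed cell in a single row.

    Assumptions (illustrative only): cells are non-negative counts, and each
    published number is within `half_width` of its true value. Setting
    half_width=0 treats the published numbers as exact.
    """
    if n_suppressed < 1:
        raise ValueError("need at least one suppressed cell")

    # Each published number v stands for a true value in [v - h, v + h];
    # counts cannot be negative, so lower ends are clamped at zero.
    total_lo = max(0, total - half_width)
    total_hi = total + half_width
    cells_lo = sum(max(0, v - half_width) for v in published)
    cells_hi = sum(v + half_width for v in published)

    # Sum of all suppressed cells = true total - sum of true published cells.
    residual_lo = max(0, total_lo - cells_hi)
    residual_hi = total_hi - cells_lo

    if n_suppressed == 1:
        return residual_lo, residual_hi
    # With several suppressed cells, the others can absorb the whole
    # residual, so the target cell can be as small as zero.
    return 0, residual_hi


# Row total 50, published cells 20 and 10, one suppressed cell.
exact = feasibility_interval(50, [20, 10], n_suppressed=1)
rounded = feasibility_interval(50, [20, 10], n_suppressed=1, half_width=2)
print(exact)    # (20, 20)
print(rounded)  # (14, 26)
```

An audit that treats the rounded row as exact sees the lone suppressed cell as fully disclosed and would call for further suppression. Once rounding is taken into account, the attacker can only place the cell between 14 and 26, which may already satisfy the agency's protection rule.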