Confidentiality Vetting Support: Dominance and Homogeneity using the tcensus function (Stata)

Release date: April 27, 2022

Welcome to Statistics Canada's Data Access Training Series. This video is part of the confidentiality vetting support series and presents examples of how to use different statistical software packages to perform the analyses required for researchers working with confidential data.

Today we are going to show you an example of how to run the homogeneity and dominance tests, including the N-K and P-percent dominance rules, on the continuous income variables of the census using Stata and the "tcensus" function.

Dominance occurs when most of the contribution to the statistic comes from one or a few units (based on unweighted contributions). N-K and P-% rules are dominance rules.
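To make the two dominance rules concrete, here is a minimal sketch in Python of the textbook form of each rule. It is not the "tcensus" implementation. The function names and the default values of n, k and p are illustrative assumptions, not the official census thresholds, which are set out in the census confidentiality rules.

```python
# Illustrative sketch only: generic textbook forms of the N-K and P-percent
# dominance rules. The thresholds n, k and p below are assumptions chosen
# for the example, not official census parameters.

def nk_rule_fails(contributions, n=1, k=0.5):
    """Return True when the n largest unweighted contributions
    account for more than share k of the cell total."""
    values = sorted((abs(v) for v in contributions), reverse=True)
    total = sum(values)
    if total == 0:
        return False  # nothing contributes, so nothing dominates
    return sum(values[:n]) / total > k


def p_percent_rule_fails(contributions, p=0.10):
    """Return True when the total, minus the two largest contributions,
    is smaller than share p of the largest contribution."""
    values = sorted((abs(v) for v in contributions), reverse=True)
    if not values:
        return False
    largest = values[0]
    remainder = sum(values[2:])  # everything except the top two
    return remainder < p * largest


# Hypothetical income cells: one dominated by a single unit, one balanced.
skewed = [900_000, 40_000, 35_000, 25_000]
balanced = [50_000, 48_000, 52_000, 51_000]

print(nk_rule_fails(skewed), p_percent_rule_fails(skewed))      # True True
print(nk_rule_fails(balanced), p_percent_rule_fails(balanced))  # False False
```

In the skewed cell, one unit accounts for 90% of the total, so both rules flag it. In the balanced cell, no unit stands out and both rules pass.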

The homogeneity rule aims to prevent the dissemination of statistics when respondents occupy a narrow range of values.

"Tcensus" is a Stata function that performs all of the census confidentiality tests. It automatically produces all supporting documents required for a disclosure request. It was developed to facilitate disclosure requests for both researchers and analysts. If you're unsure of the location of this code, please reach out to your analyst.

To use "tcensus", you first need to import the function into Stata. Then the "tcensus" command can be run like any other Stata command. The "tcensus" command is easy to use.

The first variable after the command is the variable of interest, measured in dollars.

"Household (frame_id)" and "weight (compw2)" specify the household identifier and the weight variable, respectively. The "group" option allows you to identify the categorical or ordinal variables used to define the populations of interest.

Finally, you will need to specify where the supporting document will be saved by replacing "path" with the appropriate folder in your computer session.

Note that the path should not be enclosed in quotation marks.

Here is an example of the "tcensus" command. This example simulates a request where a researcher is interested in the average individual income grouped by province and sex.

We'll start this example by loading our dummy census data.

Next, we will import the tcensus function. Then we can simply run the command.

The results will be saved in the specified folder.

The first columns indicate the variables used to define the subpopulations of interest.

The columns named "test" are indicators for the various tests performed.

The value "Fail" is shown if any of the tests failed.

The following columns contain the values of the tests.

Please refer to the census confidentiality rules, or to the survey-specific guidelines, for more information on each of the tests.

Thank you for watching! If you have any questions, please contact your analyst or send an email to:

