So that the RTRA can automate the processes necessary for confidentiality, your SAS programs must be written in a standard format. To write a program in the correct format, apply the information in the RTRA parameters document and create statistics by calling the standard RTRA macros. The terms and usage of these standards are explained below.

## RTRA parameters

The RTRA parameters document contains essential information that users require to write their SAS programs. The terms used in that document are explained below.

**SAS Tag Name -** A tag name is a unique reference term for each survey library available through the RTRA system. To ensure access to the correct survey library, the tag name must be referenced in the title of your SAS program. Please refer to Program name for directives on the correct naming convention of your SAS program.

**SAS datasets -** The SAS dataset name must be referenced using the standard libname called RTRAData. To ensure access to the correct survey dataset, please refer to the RTRA parameters page for the complete list of dataset names.

**Rounding base -** Frequencies are rounded in accordance with the rounding base specified for each survey dataset. The rounding base is developed using information on the weight distribution, minimum-respondent rules and existing rounding practice for each survey dataset.

**Variables renamed -** For RTRA compatibility reasons, certain variables are renamed.

**Deleted variables -** Sensitive variables that pose a disclosure risk are deleted from the microdata files.

**Weight -** The weight variables for each survey dataset are listed in this document. Although sample weights do not exist for administrative datasets, the standard name “WEIGHT” must still be supplied as the weight variable so that the RTRA system can pass it to the macro. This “WEIGHT” variable is equal to 1 for administrative data files.

**Execution time limit -** The execution time limit specifies the maximum length of time that a submitted program may run. This limit prevents a SAS program from running for an excessive amount of time and consuming unnecessary computing resources.

## Program name

To ensure access to the correct survey library, the tag name must be referenced in the title of your SAS program. Please refer to the RTRA parameters for the complete list of tag names.

You will need to name your SAS program starting with the "Tag Name" followed by an underscore. Following that underscore, you may call the program anything you like. For example, researchers submitting a program using the 2006 General Social Survey should name their program: GSS2006_anynameyouwant.sas. Please note that the program name cannot exceed 70 characters or include the characters & and %.
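The naming rules above can be checked before submission. The following is a minimal sketch of such a check, written as a SAS data step that you would run on your own machine; it is not part of the RTRA system. The variable names `progname`, `tag`, `prefix_ok`, `length_ok` and `chars_ok` are hypothetical, and the sketch assumes that the 70-character limit counts the `.sas` extension.

```sas
/* Hypothetical local check of a program name; not part of the RTRA system. */
data _null_;
    length progname $200 tag $32;
    progname = "GSS2006_anynameyouwant.sas";   /* example name from the text */
    tag      = "GSS2006";                      /* tag name from the RTRA parameters */

    /* The name must begin with the tag name followed by an underscore. */
    prefix_ok = (substr(progname, 1, lengthn(tag) + 1) = cats(tag, "_"));

    /* Assumption: the 70-character limit counts the .sas extension. */
    length_ok = (lengthn(progname) <= 70);

    /* Single quotes keep & and % away from the macro processor. */
    chars_ok  = (indexc(progname, '&%') = 0);

    /* Each flag is 1 when the rule is satisfied and 0 otherwise. */
    put prefix_ok= length_ok= chars_ok=;
run;
```

The comparison is case-sensitive, so the tag name should be typed exactly as it appears in the RTRA parameters.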

## Program content: Statistics

Please ensure your program follows the structure in the sections below.

### Part 1: Program element

- Users need to reference a standard libname called RTRAData. The list of corresponding dataset names can be found in the RTRA parameters.
- Do not attempt to create a SAS libref; including a libname statement will result in the termination of your program.
- In this section you can prepare the data using "proc sort" and data steps.
- When using the "keep" statement or "keep=" dataset option in SAS, you must include the 'ID' variable.
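The points above can be sketched as a short Part 1 section. This is only an illustration: the dataset name `GSS2006_main` and the variables `AGEGRP`, `SEX` and `WGHT_PER` are hypothetical placeholders, and the real dataset, variable and weight names must be taken from the RTRA parameters.

```sas
/* Part 1 sketch. GSS2006_main, AGEGRP, SEX and WGHT_PER are hypothetical names. */
/* No libname statement appears here: the RTRAData libname is already provided. */
data work.adults;
    /* ID is listed in keep= because it must be retained whenever variables are kept. */
    set RTRAData.GSS2006_main (keep=ID AGEGRP SEX WGHT_PER);
    where AGEGRP >= 3;          /* hypothetical subsetting condition */
run;

/* Optional sort of the working dataset. */
proc sort data=work.adults;
    by SEX;
run;
```

The working dataset `work.adults` would then serve as the input to the statistics created in Part 2.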

### Part 2: Statistics

In this section, tabulations are created by calling the custom RTRA procedure macros. You can call these macros a maximum of 10 times per program.

There are three types of statistics that can be calculated in RTRA:

1. **Basic Statistics:** These statistics calculate only one statistic at a time. The basic statistics available in the RTRA system are: frequency, mean, percentiles, percent distribution, proportions, ratio and share.
2. **Level 5 (L5) Statistics:** Also known as higher-order statistics, these statistics calculate differences between the basic statistics available in the RTRA system.
   - There are three different types of L5 statistics:
     1. **Level Change (LC):** Level change is defined as the difference between the values of the statistic calculated within a table.
     2. **Percent Change (PC):** Percentage change is defined as the percent difference between the values of the statistic within a table. It is calculated by taking the difference of two values within a table and dividing by the original value.
     3. **Significance Test (ST):** Significance tests calculate whether two values in a table have a difference that is statistically significant.
   - There are three methods of calculating L5 statistics. These methods refer to how the values in the table’s cells are compared to one another:
     1. **Global:** For a global L5 statistic, every value in a cell is compared to the value for the entire domain that encompasses these cells.
     2. **Base Value:** A base value L5 statistic compares the value of every cell with another specified cell (the base value).
     3. **Sequential:** A sequential L5 statistic compares the value of every cell with the value of the cell directly below it in the table. Note: The order of the domains in a table matters when using a sequential L5 statistic.
3. **Level 5 Sequential Over Time (L5SOT) Statistics:** Also known as higher-order statistics, these statistics calculate differences between the basic statistics available in the RTRA system. L5SOT statistics compare the value of every cell with the value of the cell directly below it in the table in a sequential manner over time. As such, a string of time needs to be identified in the macro so that the sequence can be shown; these time records can be yearly (L5YrVar), monthly (L5MonVar), quarterly (L5QtrVar) or a set time interval (L5TimeInt). Note: The order of the domains in a table matters when using L5SOT statistics.
   - There are three different types of L5SOT statistics:
     1. **Level Change (LC):** Level change is defined as the difference between the values of the statistic calculated within a table.
     2. **Percent Change (PC):** Percentage change is defined as the percent difference between the values of the statistic within a table. It is calculated by taking the difference of two values within a table and dividing by the original value.
     3. **Significance Test (ST):** Significance tests calculate whether two values in a table have a difference that is statistically significant.
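As a worked illustration of the level change and percent change definitions, let $x_{\text{orig}}$ be the original value and $x_{\text{new}}$ the value it is compared with. These symbols are introduced here for illustration only. The sign convention and the multiplication by 100 are assumptions; the definitions above state only that the difference is divided by the original value, and which cell plays the role of the original value depends on the method chosen (global, base value or sequential).

$$
\mathrm{LC} = x_{\text{new}} - x_{\text{orig}},
\qquad
\mathrm{PC} = \frac{x_{\text{new}} - x_{\text{orig}}}{x_{\text{orig}}} \times 100
$$

For example, with $x_{\text{orig}} = 200$ and $x_{\text{new}} = 250$:

$$
\mathrm{LC} = 250 - 200 = 50,
\qquad
\mathrm{PC} = \frac{50}{200} \times 100 = 25
$$

Under these assumptions the level change is 50 and the percent change is 25%.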

Both L5 and L5SOT statistics require a basic statistic to be calculated before they can be used. As such, there is a field within the L5 and L5SOT macros where the basic statistic is identified.

- Frequency for RTRA
- Mean for RTRA
- Percentiles for RTRA
- Percent Distribution for RTRA
- Proportions for RTRA
- Ratio for RTRA
- Share for RTRA