North American Product Classification System (NAPCS) Canada 2012 Version 1.2
Introduction
NAPCS Canada 2012 Version 1.2 updates NAPCS Canada 2012 Version 1.1. Some categories were split, and others were merged. New categories were incorporated, and some were deleted, for a net decrease of 79 product categories at different levels, making the classification more relevant to statistical programs and users. Most of the changes, however, were editorial: category titles were edited to make their wording more precise. The detailed list of changes can be obtained from Standards Division at standards-normes@statcan.gc.ca.
Standard classification structure
The standard classification structure of NAPCS Canada 2012 comprises four levels: group, class, subclass, and detail. The table below outlines the nomenclature and provides the number of categories within each level of NAPCS Canada 2012 versions 1.2 and 1.1.
Standard classification structure of NAPCS Canada 2012
Record linkage proposals involve a significant amount of initial analysis and discussion to develop a formal project contract. Information is provided to Statistics Canada so that project feasibility can be assessed before the formal record linkage application for approval can begin. To make every project step go smoothly, applicants are encouraged to familiarize themselves with the requirements for approval, to understand their data sources and what is needed for record linkage, and to think early on about desired outputs and data access protocols.
Here are some questions to consider before starting a record linkage project at Statistics Canada:
For approval
Do you have a clear research question?
Do you have a research protocol? Do you have an ethics approval?
Does your study meet the expected results from Statistics Canada's Directive on Microdata Linkage? Can you clearly demonstrate how the public interest is served by your project and why a record linkage is the best means to achieve this public benefit? Are you sure that the data you expect from linkage are not otherwise available?
Feasibility of linkage
Have you examined the available source data file documentation to ensure that the variables of interest will serve your needs and that the sample (if applicable) is adequate for your study?
If applicable, does your microdata file have personal identifiers that would enable record linkage?
If your project requires an external microdata file, have you received permission from the data owner to provide it to Statistics Canada?
Final deliverables and access
Have you thought about the structure of your final linked analysis file? Are derived variables needed? Will you need to person-orient records? How will different reference periods be treated? Will you be using longitudinal data?
What are your plans to identify and address the data quality issues associated with record linkage and with the specific sources that you have in mind?
Linkages to the Derived Record Depository (DRD) as of June 2025 (see Table Note 1)
This table displays the files linked to the Derived Record Depository (DRD). The information is shown by Source (appearing as row headers) and by Years/Cycles (appearing as column headers).
Source                                                    Years/Cycles
Contributors (files that add individuals to the DRD)
Canadian COVID-19 Antibody and Health Survey - Cycle 2    April to August 2022
Canada Student Financial Assistance Program               2009 to 2021
Rio Tinto                                                 1950 to 2019
Table Note 1
Derived Record Depository (DRD) is a national longitudinal data base of individuals derived from a number of Statistics Canada data files and containing only basic personal identifiers. Note that there may be limitations on the use of the sources listed.
The purpose of the SDLE program is to facilitate pan-Canadian social and economic statistical research. It is a record linkage environment that:
increases the relevance of existing Statistics Canada surveys without collecting new data (including maintaining the relevance of completed longitudinal surveys);
substantially increases the use of administrative data;
generates new information without additional data collection;
maintains the highest privacy and data security standards; and
promotes a standardized approach to record linkage processes and methods.
Benefits and public good
Fill data gaps: Studies conducted through the SDLE have the potential to address important information gaps related to the financial, social, economic and general activities and conditions of Canadians.
Reduce response burden: Through record linkage, important data needs in the analysis of social data can be met without incurring the cost or response burden of collecting new data.
Reduce record linkage costs: The SDLE process surrounding the preparation and management of files for record linkage is more efficient and timely through the use of a processing system and the retention of cumulative linkage results.
How it works
The SDLE is a highly secure environment that facilitates the creation of linked population data files for social analysis. It is not a large integrated data base.
At the core of the SDLE is a Derived Record Depository (DRD), essentially a national dynamic relational data base containing only basic personal identifiers. The DRD is created by linking selected Statistics Canada source index files (Definition 4) for the purpose of producing a list of unique individuals. These files are brought into the environment, processed and linked only once to the DRD. Each individual in the DRD is assigned an SDLE identifier. Some of the source index files used to build the DRD include tax records, vital statistics registration records (births and deaths), and immigrant data. Updates to these data files are linked to the DRD on an ongoing basis.
Only basic personal identifiers are stored in the DRD. Examples of personal identifiers stored in the DRD include surnames, given names, date of birth, sex, insurance numbers, parents' names, marital status, addresses (including postal codes), telephone numbers, immigration date, emigration date and date of death.
The paired SDLE identifiers and source index file record IDs resulting from the record linkage are stored in a Key Registry (Definition 2). All source index files are linked to the DRD either probabilistically using a generalized software tool (G-Link) or deterministically using SAS scripts.
Deterministic record linkage involves matching records based on unique identifiers shared by both files. On the other hand, probabilistic record linkage works with non-unique identifiers (e.g. names, sex, date of birth and postal code) and estimates the likelihood that records are referring to the same entity.
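The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration, not G-Link or the SAS scripts mentioned above: the records, field names, agreement probabilities and threshold are all hypothetical, and the probabilistic part uses a common textbook scoring scheme (agreement weights summed across fields) as an assumption.

```python
from math import log2

# Hypothetical toy records; field names and values are illustrative only.
file_a = [
    {"id": "A1", "sin": "111", "surname": "tremblay", "birth": "1980-02-01", "postal": "K1A0T6"},
    {"id": "A2", "sin": None, "surname": "nguyen", "birth": "1975-07-15", "postal": "H2X1Y4"},
]
file_b = [
    {"id": "B1", "sin": "111", "surname": "tremblay", "birth": "1980-02-01", "postal": "K1A0T6"},
    {"id": "B2", "sin": None, "surname": "nguyen", "birth": "1975-07-15", "postal": "H3A2B2"},
    {"id": "B3", "sin": None, "surname": "smith", "birth": "1990-11-30", "postal": "V5K0A1"},
]

def deterministic_links(a, b, key):
    """Link records that share the same non-missing unique identifier."""
    index = {r[key]: r["id"] for r in b if r[key] is not None}
    return [(r["id"], index[r[key]]) for r in a if r[key] in index]

# Assumed probabilities per field:
# m = P(field agrees | same person), u = P(field agrees | different people).
MU = {"surname": (0.95, 0.01), "birth": (0.97, 0.003), "postal": (0.80, 0.0005)}

def match_weight(rec_a, rec_b):
    """Sum of log-likelihood-ratio weights over the non-unique identifiers."""
    total = 0.0
    for field, (m, u) in MU.items():
        if rec_a[field] == rec_b[field]:
            total += log2(m / u)              # agreement raises the score
        else:
            total += log2((1 - m) / (1 - u))  # disagreement lowers it
    return total

def probabilistic_links(a, b, threshold):
    """Keep the best-scoring candidate for each record if it clears the threshold."""
    links = []
    for rec_a in a:
        best = max(b, key=lambda rec_b: match_weight(rec_a, rec_b))
        if match_weight(rec_a, best) >= threshold:
            links.append((rec_a["id"], best["id"]))
    return links

det = deterministic_links(file_a, file_b, "sin")
prob = probabilistic_links(file_a, file_b, threshold=10.0)
print(det)
print(prob)
```

In this toy data the second record has no unique identifier, so only the probabilistic pass can link it, even though its postal code changed between files.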
Once a study requiring linked data has been defined and approved, the associated record IDs (extracted from the Key Registry) are used to find the individual records in the source data files (Definition 3). Selected variables from these sources can then be integrated into a linked analysis file. This approach provides a virtual linkage environment that eliminates the need to build a large integrated data base.
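The key-based assembly described above can be sketched as follows. The data layout, source names, variable names and the `build_linked_file` helper are hypothetical; the sketch only shows the idea of using registry keys to pull selected variables from separate source data files that hold no personal identifiers.

```python
# Hypothetical Key Registry: pairs an SDLE identifier with a source record ID.
key_registry = [
    {"sdle_id": 1, "source": "tax", "record_id": "T-10"},
    {"sdle_id": 1, "source": "survey", "record_id": "S-7"},
    {"sdle_id": 2, "source": "tax", "record_id": "T-11"},
    {"sdle_id": 3, "source": "survey", "record_id": "S-9"},
]

# Hypothetical source data files: record IDs and analysis variables only.
sources = {
    "tax": {"T-10": {"income": 52000}, "T-11": {"income": 38000}},
    "survey": {"S-7": {"self_rated_health": "good"}, "S-9": {"self_rated_health": "fair"}},
}

def build_linked_file(registry, sources, wanted):
    """Assemble one row per individual from the requested variables.

    `wanted` maps a source name to the list of variables approved for the study.
    Only individuals found in every requested source are kept.
    """
    rows = {}
    for entry in registry:
        source = entry["source"]
        if source not in wanted:
            continue
        record = sources[source].get(entry["record_id"])
        if record is None:
            continue
        row = rows.setdefault(entry["sdle_id"], {})
        for variable in wanted[source]:
            row[variable] = record[variable]
    needed = [v for variables in wanted.values() for v in variables]
    return {key: row for key, row in rows.items() if all(v in row for v in needed)}

linked = build_linked_file(
    key_registry, sources, {"tax": ["income"], "survey": ["self_rated_health"]}
)
print(linked)
```

No combined data base is built here: the registry is consulted per study, and only the requested variables are copied into the output.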
Figure 1. Social Data Linkage Environment overview diagram
Description for Figure 1: Social Data Linkage Environment overview diagram
This figure is a visual model that serves as a summary of the text of this overview page.
Within the secure data environment at Statistics Canada, source files are separated into Source Data Files (record IDs and analysis variables without personal identifiers) and Source Index Files (record IDs and personal identifiers without analysis variables).
The Source Index Files are accessed within the record linkage production environment and linked to the Derived Record Depository (national longitudinal file of personal identifiers). The linked SDLE and record IDs are stored in the Key Registry (record IDs used as keys to find only those records needed for study).
The Source Data Files are accessed within the linked analysis file production environment that uses keys from the Key Registry to create analysis files for approved studies only and with no personal identifiers.
The SDLE program is governed by Statistics Canada senior management. The Chief Statistician reviews each record linkage proposal and, if the study is approved, an analysis file is created.
The output of this process is an Analytical Product (non-confidential aggregate data).
Data sources
The Derived Record Depository (DRD) (Definition 1) contains only record IDs and identifiers without analysis data. The principal source index files (Definition 4) that contribute to building (i.e. add individual records) and updating (i.e. provide additional information to existing records) the DRD include:
T1 Personal Master Files (tax);
Canadian Child Tax Benefits (CCTB) files;
Canadian Vital Statistics – Birth database;
Landing File; and
Canadian Vital Statistics – Death database.
Other sources will be used to create linked analysis files for approved projects (some of which may also be used to update the DRD). See DRD linkage status.
In the future, additional files could be linked to the DRD. These could be data already residing in Statistics Canada or external files brought in for specific approved research projects.
Statistics Canada has responsibility for securely storing and processing data. Because SDLE research projects involve the use of linked micro-records, approval by the Chief Statistician of Canada on a study-by-study basis is required in accordance with the Directive on Microdata Linkage. Summaries of approved record linkages are published on the Statistics Canada website.
Linked analysis files
When a research project requiring linked data from the SDLE has been approved and linked in the SDLE production environment, the record IDs for the specified cohort and the associated record IDs of the file(s) to be linked to the cohort are drawn from the Key Registry (Definition 2). These record IDs are used to bring selected variables from the separate source data files together to create a linked analysis file.
Depending on the complexity of the source data file(s), decisions about how to structure the linked analysis file may be needed (e.g. working with multiple reference periods or with event-based files, etc.). Furthermore, the quality of the linked data must be assessed. Data that are linked in the SDLE will go through two kinds of validation:
Assessment of the record linkage: What is the match rate (%) with the DRD (Definition 1)? Are the links valid? Are there false positive links or missed links?
Assessment of linked analysis file: Do the linked data appear to make sense from a subject-matter point of view? Is there any bias caused by the linkage process? Do the data adequately represent the study population of interest?
These file structuring decisions and data quality measures will be documented and need to be taken into account in the final analysis.
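The first kind of assessment can be illustrated with a minimal sketch. It assumes that a set of known true links is available for comparison (for example from a manually reviewed sample), which the text does not describe; the function name, metric definitions and data are hypothetical.

```python
def linkage_quality(links, truth, cohort_size):
    """Summarize a linkage against a reference set of true links.

    links and truth are collections of (cohort_id, drd_id) pairs.
    """
    links, truth = set(links), set(truth)
    true_positives = len(links & truth)
    linked_cohort_ids = {cohort_id for cohort_id, _ in links}
    return {
        # Share of the cohort that received any link, as a percentage.
        "match_rate_pct": 100 * len(linked_cohort_ids) / cohort_size,
        # Share of accepted links that are wrong.
        "false_positive_rate": (len(links) - true_positives) / len(links) if links else 0.0,
        # Share of true links that were not found.
        "missed_link_rate": (len(truth) - true_positives) / len(truth) if truth else 0.0,
    }

# Hypothetical example: five cohort members, four linked, one of them wrongly.
links = [("c1", "d1"), ("c2", "d2"), ("c3", "d9"), ("c4", "d4")]
truth = [("c1", "d1"), ("c2", "d2"), ("c3", "d3"), ("c4", "d4"), ("c5", "d5")]
quality = linkage_quality(links, truth, cohort_size=5)
print(quality)
```

A high match rate alone says little about validity, which is why the false positive and missed link rates are reported alongside it.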
Services
In addition to maintaining the SDLE and conducting new record linkages, the SDLE team provides support to clients as required including:
assessing project feasibility;
advising on data sources, analytical limitations, and validation;
liaising with subject-matter experts;
assistance with approval steps;
building custom linked analysis files; and
providing training and outreach.
Statistics Canada makes custom services, such as the SDLE, available to Canadian organizations on a cost-recovery basis. Cost-recovery means that clients pay for the direct and indirect cost of doing the work. Custom services are not funded by the budget that Parliament allocates to Statistics Canada. Costs reflect the requirements of each client and vary depending on the complexity of the proposal.
Linked analysis files are deemed sensitive statistical information and subject to the confidentiality requirements of the Statistics Act. To reduce the risk of privacy intrusiveness and to minimize the risk of disclosure, source files in SDLE are separated into source index files and source data files. As well, the record linkage production environment that uses the source index files is separated from the data integration and analysis environment that uses the source data files. That is, Statistics Canada employees performing the record linkages in SDLE have access to only the basic personal identifiers needed for linkage. Employees who build the analytical files for research have access only to the data stripped of personal identifiers. Anonymous keys are used to integrate the data from the various sources into a linked analysis data file. Further, only Statistics Canada employees who have an approved need to access the data for their analytical work are allowed access to the linked analysis file. The privacy impact assessment conducted by Statistics Canada found these processes acceptable to reduce the risk of privacy intrusiveness and to minimize the risk of disclosure.
Definitions
Definition 1
Derived Record Depository (DRD) is a national longitudinal data base of individuals derived from a number of Statistics Canada data files and containing only basic personal identifiers.
Record linkage application process: steps to follow
Expanding data potential
The Social Data Linkage Environment (SDLE) at Statistics Canada promotes the innovative use of existing administrative and survey data to address important research questions and inform socio-economic policy through record linkage.
The SDLE expands the potential of data integration across multiple domains, such as health, justice, education and income, through the creation of linked analytical data files without the need to collect additional data from Canadians.
Protecting personal information
Statistics Canada takes your confidentiality very seriously. Under the Statistics Act, all information provided to Statistics Canada is kept confidential, and used only for statistical purposes.
Statistics Canada ensures the privacy, confidentiality and data security of all its programs. In addition to consulting with the Office of the Privacy Commissioner, Statistics Canada conducted a privacy impact assessment to address any potential issues relating to confidentiality or security with the work being undertaken through the SDLE.
Frequently asked questions
What are the benefits of using SDLE?
The SDLE environment offers a highly secure data infrastructure for record linkage activities. It increases efficiency through the use of a processing system, thus offering more timely results and lower costs. SDLE enables linkage across multiple data sets in the social domain which fills important data gaps and can contribute to new research and a better understanding of Canadian society. SDLE also aims to standardize processes, improve methods and enhance data quality.
What services are available?
Our services and supports include: assessing the feasibility of record linkage projects, offering advice on data sources, liaising with subject-matter experts, assisting with approval steps, conducting the record linkage, building custom linked analysis files according to client specifications, advising on analytical limitations and validation, and providing training and outreach.
What kind of linkages can be done in SDLE?
Any linkage of persons can be done in SDLE.
How does SDLE maintain privacy and confidentiality?
Linked analysis files are deemed sensitive statistical information and subject to the confidentiality requirements of the Statistics Act. To reduce the risk of privacy intrusiveness and to minimize the risk of disclosure, source files in SDLE are separated into source index files and source data files. As well, the record linkage production environment that uses the source index files is separated from the data integration and analysis environment that uses the source data files. That is, Statistics Canada employees performing the record linkages in SDLE have access to only the basic personal identifiers needed for linkage. Employees who build the analytical files for research have access only to the data stripped of personal identifiers. Anonymous keys are used to integrate the data from the various sources into a linked analysis data file. Further, only Statistics Canada employees who have an approved need to access the data for their analytical work are allowed access to the linked analysis file. The privacy impact assessment conducted by Statistics Canada found these processes acceptable to reduce the risk of privacy intrusiveness and to minimize the risk of disclosure.
Is there a cost to use SDLE services?
Statistics Canada makes custom services, such as the SDLE, available to Canadian organizations on a cost-recovery basis. Cost-recovery means that clients pay for the direct and indirect cost of doing the work. Custom services are not funded by the budget that Parliament allocates to Statistics Canada. Costs vary depending on the complexity and the requirements of the proposal.
How much does it cost?
The SDLE is a cost-recovery program. Every project is unique and a range of outputs is available. Costs reflect the requirements of each client and vary depending on the complexity of the proposal.
The Citizenship and Immigration Canada (CIC) permanent resident landing file contains approximately 2.75 million records corresponding to all individuals who landed in Canada during the 2003 – 2013 time frame. The information in the data file is derived from the information included on each individual’s landing record and has not been updated since the time of landing. The variables available may be described using the subjects list below. There are many more variables on the data file because grouped variables have been derived from the landing record data values. For example, age in years is reported on the landing record. An additional two variables corresponding to 5 and 15 year age groups have also been added to the data file. Another example is that the country of birth is reported on the landing record, while an additional two variables which categorize that country into a region of the world and an area of the world have been added to the data file.
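A grouped variable of the kind described above can be derived with a one-line rule. The sketch below is an assumption about how such groups might be formed (fixed-width bands starting at age 0); the actual group boundaries and labels used on the file are not given here, and the function name is hypothetical.

```python
def age_group(age, width):
    """Return the fixed-width age band containing `age`, e.g. '35-39'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

age = 37  # age in years as reported on the landing record
five_year = age_group(age, 5)
fifteen_year = age_group(age, 15)
print(five_year, fifteen_year)
```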
Reference period
2003 – 2013
Subjects
Age in years, plus 5 year age groups and 15 year age groups
Marital Status
Gender
Mother Tongue
Official Languages Spoken
Date of Landing: year-month-day
Education Level: none, secondary or less, …, doctorate
Years Of Schooling
Country of Birth, plus grouped categories region & area of the world
Country of Citizenship, plus grouped categories region & area of the world
Intended destination: CMA, census division and province (or, if not available, the last known address)
Immigration category – provided in first, second, third and fourth level groupings of the immigration category hierarchy
Occupation title as listed on the landing record (approximately 9900 categories)
Skill levels (two different hierarchies used) corresponding to occupation title as listed on the landing record
NOC Code (2006 and 2011) derived from occupation title as listed on the landing record
Target population
A person is included in the database only if he or she obtained landed immigrant or permanent resident status in Canada between 2003 and 2013.
Sampling
Data are collected for all units of the target population, therefore no sampling is done.
Behind the data provides simple explanations of concepts found in Statistics Canada's The Daily. For more detailed information on methods, please consult the Standards, data sources and methods section.
By Susie Fortier, Steve Matthews and Guy Gellatly, Statistics Canada
Statistics Canada releases graphical information on trend-cycle movements for several monthly economic indicators. Estimates of the trend-cycle are presented along with the seasonally adjusted data in selected charts in The Daily. The inclusion of trend-cycle information is intended to support the analysis and interpretation of the seasonally adjusted data.
This reference document provides information on trend-cycle data. It outlines basic concepts and definitions and discusses selected issues related to the use and interpretation of trend-cycle estimates. The document includes a specific example using data on monthly retail sales. Detailed information on the computation of the trend-cycle is also provided.
1. What is the trend-cycle of a time series?
Trend-cycle data represent a smoothed version of a seasonally adjusted time series. They provide information on longer-term movements, including changes in direction underlying the series.
The trend-cycle is the combination of two distinct components:
The trend provides information on longer-term movements in the seasonally adjusted data series over several years.
The cycle is a sequence of smoother fluctuations around the longer-term trend in part characterized by alternating periods of expansion and contraction.
Changes in trend-cycle data reflect the influence of factors that condition long-run movements in the economic indicator over time, along with fluctuations in economic activity associated with the business cycle. These two components, the trend and the cycle, are often paired together because of the difficulty involved in estimating them individually.
2. What is the difference between a seasonally adjusted series and its trend-cycle?
A seasonally adjusted data series is a series that has been modified to eliminate the effect of seasonal and calendar influences in order to facilitate comparisons of underlying conditions from period to period. Seasonally adjusted data series can also be defined as the combination of the trend-cycle and the irregular component of a time series.
In much the same way as a seasonally adjusted series represents the raw series with seasonal and calendar effects removed, the trend-cycle estimates represent the seasonally adjusted series with the irregular component removed. As its name suggests, the irregular component is the part of the time series that is not in line with the usual or expected pattern of the series. This irregular component is not part of the trend-cycle, nor is it related to current seasonal factors or calendar effects.
The irregular component of a time series can represent unanticipated economic events or shocks (for example, strikes, disruptions, natural disasters, unseasonable weather, etc.) or can simply arise from noise in the measurement of the unadjusted data. In some cases, this irregular component can make large contributions to the period-to-period movements in a seasonally adjusted time series.
By removing this irregular component from seasonally adjusted data, the trend-cycle data can yield a better picture of longer-term movements in the time series. In this sense, the trend-cycle can be interpreted as a smoothed version of the seasonally adjusted series.
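As a rough illustration of this relationship, the sketch below assumes an additive decomposition (seasonally adjusted = trend-cycle + irregular). The text does not say whether the combination is additive or multiplicative; under a multiplicative decomposition the irregular would be a ratio instead. The three pairs of values come from the retail sales table in this document, and the function name is hypothetical.

```python
# Values in $ billion from the retail sales table:
# (seasonally adjusted, trend-cycle).
retail = {
    "November 2010": (37.568, 37.15),
    "December 2010": (37.393, 37.30),
    "January 2015": (41.523, 42.22),
}

def irregular_component(seasonally_adjusted, trend_cycle):
    """Irregular = seasonally adjusted - trend-cycle (additive assumption)."""
    return round(seasonally_adjusted - trend_cycle, 3)

irregular = {month: irregular_component(sa, tc) for month, (sa, tc) in retail.items()}
for month, value in irregular.items():
    print(f"{month}: {value:+.3f}")
```

Under this assumption, the January 2015 seasonally adjusted value sits about $0.7 billion below the trend-cycle, the kind of large irregular contribution the preceding paragraphs describe.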
3. What can we learn from trend-cycles?
Trend-cycle data provide information on longer-term movements in a seasonally adjusted time series, including changes in the direction of the data. These smoothed data make it easier to identify periods of positive change (growth) or negative change (decline) in the time series, as the noise of the irregular component has been removed. This allows for a more accurate identification of turning points in the data.
For example, the accompanying graph presents data on monthly retail sales in Canada from July 2010 to July 2015. Two data lines are shown: the seasonally adjusted time series and the trend-cycle estimates. The trend-cycle estimates for the most recent reference months are more subject to revision than the estimates for previous periods, and are presented as a dotted line (see question 5).
While the seasonally adjusted data can be used to examine basic changes in the direction of the time series, it is easier to see the longer term movement in these data from the trend-cycle line. The trend-cycle estimates show that retail sales trended upward at a relatively constant rate during 2010 and 2011, and then slowed in 2012. Growth resumed from late 2012 until mid-2014, before sales trended downward in late 2014. Trend-cycle data for early 2015 indicated a return to growth. Estimates for this most recent period are based on a preliminary estimation of the trend-cycle and should be interpreted with caution as they are subject to revision as noted above.
Figure 1 — Retail sales
Sources: CANSIM table 080-0020 extracted on October 14, 2015; and trend-cycle computations.
Description for Figure 1
Table 1 — Retail sales
$ billion
Month             Seasonally adjusted    Trend-cycle
July 2010         36.295                 36.51
August 2010       36.515                 36.64
September 2010    36.633                 36.79
October 2010      36.880                 36.97
November 2010     37.568                 37.15
December 2010     37.393                 37.30
January 2011      37.392                 37.45
February 2011     37.438                 37.55
March 2011        37.617                 37.64
April 2011        37.755                 37.73
May 2011          37.724                 37.81
June 2011         38.228                 37.92
July 2011         37.926                 38.03
August 2011       37.977                 38.18
September 2011    38.182                 38.34
October 2011      38.624                 38.54
November 2011     38.780                 38.74
December 2011     39.088                 38.89
January 2012      39.069                 38.99
February 2012     38.942                 39.02
March 2012        39.179                 39.00
April 2012        38.906                 38.94
May 2012          38.774                 38.90
June 2012         38.798                 38.89
July 2012         38.901                 38.91
August 2012       38.918                 38.96
September 2012    39.083                 39.04
October 2012      39.203                 39.14
November 2012     39.314                 39.22
December 2012     39.041                 39.31
January 2013      39.467                 39.44
February 2013     39.673                 39.56
March 2013        39.731                 39.72
April 2013        39.624                 39.88
May 2013          40.337                 40.06
June 2013         40.078                 40.25
July 2013         40.428                 40.41
August 2013       40.612                 40.54
September 2013    40.802                 40.67
October 2013      40.689                 40.73
November 2013     40.929                 40.80
December 2013     40.627                 40.88
January 2014      40.987                 41.00
February 2014     41.196                 41.19
March 2014        41.196                 41.41
April 2014        41.766                 41.70
May 2014          41.840                 41.98
June 2014         42.591                 42.27
July 2014         42.585                 42.48
August 2014       42.419                 42.59
September 2014    42.799                 42.61
October 2014      42.619                 42.55
November 2014     42.886                 42.43
December 2014     42.124                 42.28
January 2015      41.523                 42.22
February 2015     42.184                 42.30
March 2015        42.585                 42.45
April 2015        42.564                 42.63*
May 2015          42.937                 42.82*
June 2015         43.129                 43.00*
July 2015         43.345                 43.16*
* preliminary estimate
Trend-cycle data are particularly useful when the irregular component makes large contributions to the month-to-month movements in a seasonally adjusted time series. In these cases, graphical information on the trend-cycle helps to interpret the movements in the seasonally adjusted series.
4. Why are trend-cycle data revised?
Existing estimates of the trend-cycle are revised with each release of new seasonally adjusted data. As new seasonally adjusted data become available, the trend-cycle data for previous months can be better estimated. If the trend-cycle data were not revised along with the seasonally adjusted series, the resulting trend-cycle data could contain series breaks, and would likely be inconsistent with the seasonally adjusted series in terms of levels, period-to-period movements, or both. It is necessary to revise the trend-cycle data to maintain their analytical value.
5. Why is the trend-cycle line dotted for the most recent reference months?
The published trend-cycle line is dotted for the most recent reference periods, as these periods are more likely to be subject to revisions. This signals that the trend-cycle data for these periods are preliminary estimates, subject to change as new data become available. New data make it possible to more accurately estimate the various components that make up the time series. These revisions can change the location of economic turning points, as well as reverse movements between individual months. These types of revisions are more likely to occur in the most recent reference months.
6. Can the trend-cycle be interpreted as a means of forecasting data for future reference periods?
The trend-cycle should not be viewed as a way to forecast the underlying seasonally adjusted data. These estimates are based solely on the historical values of the seasonally adjusted series and do not take into account any other information that could be used to project data for future reference periods. Furthermore, since the trend-cycle is subject to revision when additional reference periods are added to the series, the shape of the trend-cycle in the most recent reference periods should be viewed as a preliminary estimate.
7. What methods can be used to estimate the trend-cycle series?
There is no single recommended method for estimating the trend-cycle that underlies a time series. A variety of methods have been developed in the literature, ranging from very simple to highly complex. Some methods introduce restrictions on the shape of the trend (for example a linear trend of several years), others are based on explicit models that estimate a trend-cycle component, and others still are based on variations of moving averages, where the mean of the data is calculated from successive sub-spans or intervals of the data.
Since the trend-cycle can also be interpreted as a smoothed version of the seasonally adjusted series, a straightforward way of estimating the trend-cycle is by averaging the last three or six months of the data. While this may yield additional insight into the long-term movement in the series, some measure of caution is warranted as this approach does not take the place of more formal trend-cycle estimation techniques. It can be shown that indicators of the economic cycle derived from this simplified method tend to shift in time and may be artificially dampened.
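The shift and dampening mentioned above can be seen in a small sketch of the simplified approach, a plain average of the last three observations. The series is made up for illustration and the function name is hypothetical.

```python
def trailing_mean(series, window):
    """Average of the current and previous (window - 1) observations.

    Positions without enough history are returned as None.
    """
    smoothed = []
    for i in range(len(series)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            smoothed.append(sum(series[i - window + 1:i + 1]) / window)
    return smoothed

# Made-up series with a single peak at position 3.
raw = [1, 2, 3, 4, 3, 2, 1]
smoothed = trailing_mean(raw, 3)

raw_peak = raw.index(max(raw))
values = [v for v in smoothed if v is not None]
smoothed_peak = smoothed.index(max(values))
print(raw_peak, smoothed_peak, max(values))
```

The smoothed peak arrives one period after the peak in the raw series and is lower than it, which is the behaviour the paragraph warns about.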
8. How does Statistics Canada estimate the trend-cycle series?
Statistics Canada uses a weighted moving average of the data to compute the trend-cycle. This method is based on the Cascade Linear Filter of Dagum and Luati (2008). This weighted average is computed using the previous six months, the current month and (for older estimates) up to six of the subsequent months in the series. In real time, for the most recent reference month in the series, only data for the six previous months and current month are used, as data for subsequent months are not yet known. As these data become available, the trend-cycle estimates will be revised.
This specific weighted moving average method was selected after an empirical analysis of different alternatives. The estimate of the trend-cycle obtained with the selected method exhibits good statistical properties, as it provides smooth results with limited revisions, and has a low incidence of falsely identifying turning points. As well, it is a linear process and will preserve additive relationships in the data. This implies, for example, that the trend-cycles plotted on employment for men and women separately will sum to the plotted trend-cycle line for both sexes. The method is easy to replicate as the weights used in the calculation of the weighted average are available.
9. How does the trend-cycle method work in a more technical sense?
The trend-cycle is estimated by applying moving averages weighted according to the cascade linear filter to the seasonally adjusted series. In general, the moving average used to calculate the trend-cycle for a specific reference month is a weighted average of up to 13 consecutive months, which are centered on the reference month, where possible.
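The mechanics of a centred weighted moving average of up to 13 months can be sketched as follows. The weights below are placeholders chosen only for illustration; they are not the cascade linear filter weights, which are not reproduced here. The handling of the ends of the series (truncating the window and rescaling the remaining weights) is also an assumption, and the published method may treat the most recent months differently.

```python
import math

# Placeholder symmetric weights for a 13-term average. These are NOT the
# cascade linear filter weights; they only illustrate the mechanics.
PLACEHOLDER_WEIGHTS = [1, 2, 3, 4, 5, 6, 7, 6, 5, 4, 3, 2, 1]

def weighted_trend(series, weights=PLACEHOLDER_WEIGHTS):
    """Centred weighted moving average, truncated where months are missing.

    For the latest month only the six previous months and the current month
    are available, so the window is cut and the weights are rescaled
    (an assumption made for this sketch).
    """
    half = len(weights) // 2
    last = len(series) - 1
    smoothed = []
    for t in range(len(series)):
        lo, hi = max(0, t - half), min(last, t + half)
        w = weights[lo - t + half:hi - t + half + 1]
        window = series[lo:hi + 1]
        smoothed.append(sum(wi * xi for wi, xi in zip(w, window)) / sum(w))
    return smoothed

# Made-up series: the filter is linear, so smoothing the total gives the
# same result as adding the two smoothed parts.
men = [float(10 + i) for i in range(20)]
women = [float(12 + (i % 4)) for i in range(20)]
both = [m + w for m, w in zip(men, women)]

additive = all(
    math.isclose(total, part_m + part_w)
    for total, part_m, part_w in zip(
        weighted_trend(both), weighted_trend(men), weighted_trend(women)
    )
)
print(additive)
```

Because the same weights are applied to every series at a given position, the additivity described in question 8 holds for any fixed set of weights, not just the ones used here.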
The following references provide more information on the topic of seasonal adjustment, including trend-cycle estimation.
Dagum, E. B. and Luati, A. 2008. "A Cascade Linear Filter to Reduce Revisions and False Turning Points for Real Time Trend-Cycle Estimation," Econometric Reviews. 28:1-3, 40-59.
Statistics Canada recognizes that data users require access to microdata at the business, household or personal level for analytical and research purposes. To encourage the use of microdata, Statistics Canada offers a wide range of programs and access solutions.
All available access solutions are displayed in the continuum of data access below, which provides an overview of all types of data available at Statistics Canada. Each access solution prioritizes the confidentiality of respondents to ensure no personal or identifiable information is published.
Continuum of data access
Self-serve access solutions, available with minimal restrictions, evolve into secure access solutions, available with security procedures.
Automated data ingestion
A self-serve way to retrieve data programmatically and reuse it in applications, databases, and analyses.
Access solution
Application program interface (API): Allows data users to access Statistics Canada aggregate data and metadata by connecting directly to our public facing databases. The Statistics Canada web services provide access to the time series made available on Statistics Canada's website in a structured form.
Outcomes or products – data exploration, extraction, and use as an analytical tool for academic and policy research
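As a rough illustration of programmatic access, the sketch below prepares a request for the most recent observations of one time series. The base URL, the `getDataFromVectorsAndLatestNPeriods` method name, and the `vectorId`/`latestN` request fields are written from memory of the Web Data Service user guide and should be checked against the current documentation before use. The helper names and the vector identifier in the usage note are hypothetical.

```python
import json
import urllib.request

# Assumed base URL of the Statistics Canada Web Data Service (verify before use).
WDS_BASE_URL = "https://www150.statcan.gc.ca/t1/wds/rest"


def build_latest_periods_request(vector_id: int, latest_n: int) -> urllib.request.Request:
    """Prepare a POST request for the latest N periods of one time series vector.

    The method name and payload fields are assumptions based on the Web Data
    Service user guide; this helper is not part of any official client.
    """
    payload = [{"vectorId": vector_id, "latestN": latest_n}]
    return urllib.request.Request(
        url=f"{WDS_BASE_URL}/getDataFromVectorsAndLatestNPeriods",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request: urllib.request.Request, timeout: float = 30.0):
    """Send the prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `fetch_json(build_latest_periods_request(vector_id=12345, latest_n=3))` would then return the decoded response, where `12345` is a placeholder to be replaced with a real vector identifier from the Statistics Canada website. Separating request construction from the network call keeps the payload easy to inspect and test offline.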
Public use microdata files
Access solution
Free download for select files
Subscription to Public Use Microdata File (PUMF) Collection: Unlimited access to data and documentation is available through a database with an easy-to-use discoverability tool.
Outcomes or products – data exploration, extraction, and use as an analytical tool for academic and policy research
Self-serve tabulation tool
Access solution
Subscription to Real Time Remote Access (RTRA): Indirect access to Statistics Canada's microdata files, to produce non-confidential tabulations, via remotely submitted SAS programs. It is suitable for clients primarily looking for descriptive statistics.
Outcomes or products – generating a full range of descriptive statistics that can be used for academic and policy research, training, and policy briefings
Confidential microdata files
Data at the individual or institutional level accessed in a secured environment.
Access solution
Virtual Data Lab (vDL): A secure cloud infrastructure used to store and facilitate access to microdata research projects. The vDL grants qualifying data users a more flexible approach to accessing Statistics Canada microdata. Data users can access their microdata projects from various locations, such as their home or office, depending on the sensitivity of the data.
Virtual Research Data Centre (vRDC): A modern virtual infrastructure that will provide academic data users with secure access to Statistics Canada microdata through a partnership with the Canadian Research Data Centre Network (CRDCN). Qualifying data users will have access to data within secure RDC facilities, as well as from other authorized workspaces (e.g., a home or office). The vRDC is expected to start coming online in 2023.
Location of access
Secure Access Points: Statistics Canada premises (e.g., Research Data Centres), secure rooms, authorized workspaces (e.g., personal residence)
Outcomes or products
Policy research – answering policy and academic research questions that require the use of advanced analytical methods such as complex multivariate analysis and modelling
Academic research
Evidence-based policy/decision-making
Self-serve access to microdata
Statistics Canada offers Public Use Microdata Files (PUMFs) to institutions and individuals. The files contain non-aggregated data that are carefully modified and reviewed to ensure that no individual or business is directly or indirectly identified. They can be accessed directly through the Data Liberation Initiative (DLI) or the PUMF Collection with a paid subscription. Individual PUMFs can be downloaded from the Statistics Canada website at no cost. Statistics Canada also offers remote access solutions to data users.
The Public Use Microdata Files (PUMF) Collection is a subscription-based service for institutions that require unlimited access to all anonymized and non-aggregated data. This is available through an Electronic File Transfer (EFT) Service and the Rich Data Services (RDS) platform, an Internet Protocol (IP) restricted online database with an easy-to-use interface. Select files are also available free of charge from the Statistics Canada website.
The Data Liberation Initiative (DLI) is a partnership between postsecondary institutions and Statistics Canada that improves access to Canadian data resources, providing faculty and students with unlimited access to numerous public use datasets and geographical files.
Real Time Remote Access (RTRA) is an online tabulation tool that allows subscribers to run SAS programs in real time to extract results from confidential microdata in the form of tables.
Secure access to microdata
Statistics Canada provides secure access to confidential microdata for complex statistical analysis to support research, evidence-based decision making, policy development, program management and public understanding. Data users have direct access to a wide range of anonymized survey, administrative and integrated data.
Organizations can receive accreditation by entering into a memorandum of understanding, a section 10 agreement or an organization access agreement with Statistics Canada. Accredited data users are approved researchers and analysts from organizations that follow the protocols for accessing data in a secure environment.
To access microdata, data users must become deemed employees of Statistics Canada. This includes obtaining security clearance, completing mandatory training, and swearing or affirming the Oath of office and secrecy to Statistics Canada.
All data outputs are vetted for confidentiality by Statistics Canada employees before being released to data users.
Data access for academic data users
Research Data Centres (RDCs) are secure physical environments available to accredited academic researchers to access anonymized and non-aggregated microdata for research purposes. RDCs are on university campuses across Canada and staffed by Statistics Canada employees.
The Virtual Research Data Centre (vRDC) information technology platform is a modern virtual infrastructure that provides academic researchers with secure access to Statistics Canada microdata through a partnership with the Canadian RDC Network. Qualifying data users can access data within secure RDC facilities and from other “authorized workspaces” (e.g., a home or office location). The vRDC will be launching in 2025/2026.
Data access for federal government users
Federal government employees with an approved eligible access agreement can access confidential microdata remotely, in authorized workspaces, via the Virtual Data Lab (vDL) or onsite in the Secure Data Access Centre (SDAC), formerly known as the Federal RDC (FRDC), in Ottawa. Fees for access vary depending on access requirements.
Data access for provincial and territorial government users
Provincial and territorial government employees with an approved project can access confidential microdata remotely, in authorized workspaces, via the Virtual Data Lab (vDL). Access fees vary depending on the project.
Data access for non-profit organizations, non-governmental organizations and the private sector
Non-profit organizations, non-governmental organizations and the private sector can access confidential microdata, depending on the eligibility of their project, either remotely in authorized workspaces via the Virtual Data Lab (vDL), onsite at the SDAC (formerly the FRDC) in Ottawa, or at a local RDC. Access fees vary depending on the project.
Biospecimens like blood, urine and DNA (deoxyribonucleic acid) samples are collected from consenting participants of the Canadian Health Measures Survey and are accessible only for approved research initiatives that meet ethical standards. The resulting analyses are made available through RDCs. Under no circumstances will personal or identifiable information be published. Datasets of potential interest are available to approved academic and government data users.