Accessibility at Statistics Canada

The Accessible Canada Act (ACA), which came into force in July 2019, aims to create a barrier-free Canada by 2040. To achieve this goal, the ACA requires regulated entities to develop and publish accessibility plans, establish feedback processes, and report transparently on their progress. As part of this effort, we encourage you to provide feedback to help us build an accessible and barrier-free Canada. You can comment on our accessibility plan and describe any accessibility barriers you have encountered with Statistics Canada. Your input is vital to ensuring we make meaningful progress.

Provide feedback

Services and information

Road to Accessibility, 2023-2025

Accessibility plan: Policies, programs, practices, and services that help our organization contribute to the goal of an accessible and barrier-free Canada

Road to Accessibility, 2024 progress report

Results of our activities to improve how our organization contributes to the goal of an accessible and barrier-free Canada

Feedback process

Different ways to provide feedback, how to request alternate formats, what we do with your feedback and how to keep your feedback anonymous

Registered Apprenticeship Information System (RAIS) Guide, 2021

Concepts used by the Registered Apprenticeship Information System (RAIS)

Designated trades

Apprenticeship training and trade qualifications in Canada are governed by the provincial and territorial jurisdictions. These jurisdictions determine the trades for which apprenticeship training is made available, as well as the trades for which certificates are granted. These are referred to as designated trades. The jurisdictions also determine which of the designated trades require certification in order to work unsupervised in the trade. The list of designated trades varies considerably between the jurisdictions. Data from the Registered Apprenticeship Information System (RAIS) include those trades that are designated in at least one province or territory.

Registered apprentices are people who are in a supervised work training program in a designated trade within their provincial or territorial jurisdiction. The apprentice must be registered with the appropriate governing body (usually a Ministry of Education or Labour or a trade-specific industry governing body) in order to complete the training.

Trade Qualifiers or Trade Challengers are people who have worked in a specific trade for an extended period of time, without necessarily having ever been an apprentice, and who have received certification from a jurisdiction, usually done via a skills assessment examination in the trade.

Registrations

The total registrations in apprenticeship programs is the count of any registrations that occurred during the reporting period (from January to December of the calendar year) within one of the 13 jurisdictions (provinces or territories).

Total registrations = Already registered + New registrations + Reinstatements

  • Already registered - the number of registrations carried forward from the previous calendar year
  • New registrations - new entrants to any apprenticeship program that occurred during the 12-month reporting period
  • Reinstatements - registrations by people who had left an apprenticeship program in a specific trade in a previous year and had returned to the same apprenticeship program during the reporting period
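
The identity above can be checked with a short calculation. The Python sketch below is illustrative only: the function name and the sample counts are hypothetical and are not RAIS figures.

```python
def total_registrations(already_registered: int,
                        new_registrations: int,
                        reinstatements: int) -> int:
    """Total registrations for one jurisdiction and one reporting year."""
    counts = (already_registered, new_registrations, reinstatements)
    # Counts of registrations cannot be negative.
    if any(c < 0 for c in counts):
        raise ValueError("registration counts cannot be negative")
    return sum(counts)


# Hypothetical counts, for illustration only (not RAIS data).
example_total = total_registrations(
    already_registered=1200,
    new_registrations=300,
    reinstatements=25,
)
print(example_total)
```

Keeping the three components as separate arguments mirrors the definition: each term can be examined on its own before it is added to the total.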

Red Seal and non-Red Seal Programs

The Red Seal Program sets common standards for assessing the skills of tradespersons across Canada in specific trades, referred to as the "Red Seal" trades. Tradespersons who meet the Red Seal standards, through examination, receive a Red Seal endorsement on their provincial/territorial trade certificates. The Red Seal endorsement provides recognition that the certificate meets an interprovincial standard that is recognized in each province and territory.

Non-Red Seal trades do not have interprovincial standards. Many of these trades do not have an examination requirement in order to work in the trade.

Certification

The requirements for granting a certificate vary by jurisdiction in Canada. In most instances, an apprentice is issued a certificate if they complete requirements such as supervised on-the-job training and technical training, and pass one or more examinations. Most trade qualifiers (Challengers), meanwhile, become certified once they pass an examination.

Certification terminology

There are jurisdictional differences in the names of certificates awarded.

They may include:

  • Certificate of Apprenticeship
  • Diploma of Qualification
  • Certificate of Qualification
  • Journeyperson's Certificate
  • Certificat d'aptitude
  • Certificat de compagnon
  • Certificat de compétence
  • Diplôme d'apprentissage

Federal, provincial and territorial changes pertinent to the interpretation of RAIS data

1. Revisions have been made to the Quebec 1991 to 2005 data, which also changed the previous Canada totals.

2. Prior to 1999, Nunavut was part of the Northwest Territories.

3. Starting in 2003, a change occurred in the reporting of Newfoundland and Labrador's information concerning newly registered apprentices and cancellations/suspensions.

4. The British Columbia data have been revised in 2005. This changed the previous Canada totals for 2005.

5. Starting with the 2005 reporting year, Prince Edward Island changed their information system and this may have affected historical comparisons. At the end of 2006, Prince Edward Island made some adjustments and revisions to their database which accounted for the change in the carry-over of registered apprentices for the beginning of 2007. In 2007, an increase in new registrations is, to some extent, related to a demand for skilled workers outside of the province. In 2008, due to technical difficulties during the redesign of their Registered Apprenticeship Information System, Prince Edward Island was not able to report a number of apprentices.

6. In 2006, minor trade code revisions were made to Manitoba.

7. In 2006 and 2007, differences may occur in Ontario related to the carry-over totals of active apprentices between both years. This is a result of the conversion of client data into Ontario's new database system. As a result, a clean-up of inactive clients occurred and this adjusted the active total of registered apprentices and their carry-over into 2007.

8. As of 2008, the portion of total Quebec trade information coming from Emploi-Quebec (EQ) is no longer being provided in aggregated form. The data from the province includes all trades with the exception of the automotive sector.

9. In 2008, Alberta incorrectly included the Industrial warehousing trade with the Partsperson and Partsperson (material) trades and also excluded the Construction Craft Worker trade.

10. In 2008, a distinct feature of the Rig Technician trade is that although individuals may be registered as apprentices in the trade in Ontario, their certificates are granted as trade qualifiers (challengers).

11. In 2008, Alberta reported a large number of discontinued apprentices, which was a result of them implementing a series of cancellations and suspensions of inactive apprentices.

12. In 2008 and 2009, new Quebec legislation affecting the Emploi-Quebec (EQ) sector trade was introduced. This resulted in some changes in the reporting of registered apprenticeship registrations.

13. An adjustment has been made to the Joiner trade in British Columbia, to include the trade in the Interior finishing major trade group, rather than in the previous Carpenter's major trade group.

14. In 2010, the Emploi-Quebec (EQ) data included revised trade programs where some of the trades have been segmented into several levels. This segmentation created possible multiple registrations and completions by a single individual apprentice, where previously only one registration and completion existed for this individual.

15. In 2011, the Electronics technician (Consumer Products) trade was no longer designated as a Red Seal trade.

16. In 2012, the Gasfitter - Class A and Gasfitter - Class B trades were designated as Red Seal trades.

17. In 2013, changes in provincial regulations governing drinking water related trades reported by Emploi-Quebec (EQ) resulted in program changes, as well as the transfer of responsibility for some of these trades to the Conseil de la Construction du Québec (CCQ).

18. Beginning in 2013, Ontario's data is received from two organizations. The registration data continues to be reported by the Ministry of Advanced Education and Skills Development (MASED), which is also responsible for issuing Certificates of Apprenticeship upon the completion of technical training and on-the-job hours. The Ontario College of Trades (OCOT) is responsible for reporting data on Certificates of Qualification, which are issued to apprentices upon the completion of a certification exam. This administrative practice has affected the RAIS data in a number of different ways.

  1. On April 8, 2013, MASED awarded a Certificate of Apprenticeship to approximately 6,000 apprentices who had completed their technical training and on-the-job hours, and had not yet received a Certificate of Qualification.
  2. There are discrepancies in the number of apprentices in Ontario due to differences in how MASED and OCOT define an apprentice. OCOT considers apprentices to be their members, for whom they have received membership applications with payment of annual membership fees. MASED considers apprentices to be individuals for whom they have received signed training agreements. In the MASED registration data, apprentices can have active and inactive statuses, which can also contribute to discrepancies. Inactive apprentices are apprentices for whom MASED has not received information about their progression in their apprenticeship program for more than a certain period of time. Active and inactive apprentices are included in the RAIS data. As such, the RAIS data may include previously registered apprentices who have since discontinued their apprenticeship program but have not yet informed MASED that they have done so.
  3. Beginning in 2013, apprentices who discontinued from apprenticeship programs in the past, but who remained on the database as already registered apprentices began to be removed from MASED records. These removals appear in the RAIS data files in the following years. The clean-up occurred during odd years (2013, 2015, and 2017). After discussion with the Ontario data partners in 2019, it was indicated that the last of these batch discontinuations were completed in 2017. As a result, there will be less of a spike in discontinuations, and more of a normalized trend from here starting in 2018 and onwards. Normal discontinuation figures for the province will be about 5,000 to 7,000 per year.
  4. In 2014 and 2015, apprentices who did not receive their Certificate of Qualification or Certificate of Apprenticeship in the same year were classified as trade qualifiers (Challengers) rather than apprentices. To align the RAIS data with the standard definition of trade qualifier (Challengers), these records were reclassified as apprentices with the release of the 2016 RAIS data. This revision led to a decrease of about 2,600 trade qualifiers (Challengers) in Ontario in both 2014 and 2015 compared to the previously released data.

19. In 2013, a regulatory change came into effect which affects both Ornamental ironworkers and Structural steel erectors under the jurisdiction of the Conseil de la Construction du Québec (CCQ). Workers in these two trades are now considered Ironworkers. Both the 2014 and 2015 reference years were also impacted by these regulatory changes.

20. In 2013, changes were made to the Automotive Service Technician trades in British Columbia. Apprentices no longer have to complete mandatory work-based training hours at each program level before progressing to the next level of technical training. The 2014 reference year was also impacted by these changes.

21. Certificates in the Steamfitter/Pipefitter trade under the Conseil de la Construction du Québec (CCQ) also include Plumbers.

22. Starting in 2013, the Building/Construction Metalworker trade is coded to Metal Workers (other) instead of being included in the 'Other' category.

23. In 2014, the Heavy Equipment Operator (Dozer), Heavy Equipment Operator (Excavator) and Heavy Equipment Operator (Tractor-Loader-Backhoe) trades were designated as Red Seal trades.

24. Trade qualifiers (Challengers) in trades governed by Emploi-Quebec (EQ) represent certificates granted to individuals who received recognition for previously completed training. Emploi-Quebec (EQ) may, for example, recognize training in the case where an individual has a certificate in other provinces, territories, countries, or if the individual received a Diploma of Vocational Studies (DVS) in Quebec. These trade qualifiers (Challengers) also represent certificates granted as part of the regular re-certification process required in certain trades.

25. In March of 2014, there were changes made to the eligibility for the Apprenticeship Training Tax Credit (ATTC) in Ontario. This may have affected registration counts in some trades including those for information technology.

26. Prior to 2014, three welder programs (level A, level B, and level C) were offered in British Columbia. Starting in 2014, these three programs began to be phased out and replaced by a single apprenticeship program for welders. This change will impact registrations and certifications in this trade for the years following 2014.

27. Starting in 2017, changes are being made to the Automotive Service Technician program in British Columbia. The program is being restructured to align with other Canadian jurisdictions' Automotive Service Technician Red Seal programs. These changes impacted reinstatement totals for 2017 and will potentially influence registration counts for years following 2017.

28. In July 2018, Manitoba announced that it will perform a data clean-up every two years, starting with the 2019 reporting year. This clean-up resulted in lower numbers for both registrations and certifications for the 2019 reporting year.

29. In 2013, the structural steel erector trade and locksmith trade merged to become the ironworker trade. Transitional measures were put in place for journeypersons in these trades, which ended in July 2018.

30. British Columbia has some broad categories of trades where it is possible to receive a certificate after each level is completed, while other jurisdictions only certify apprentices after completing the final level.

  1. In 2019, the Industry Training Authority (ITA) made a decision to group some of their trades under one general trade. For example, Automotive Service Technician 1, Automotive Service Technician 2, and Automotive Service Technician 3 were combined into Automotive Service Technician.
  2. All the trades under Welder were not consolidated, but a general version of the Welder trade was created in 2019.
  3. Also, some apprenticeships were deactivated for certain trades and replaced by Challenge Pathway only, which is for trade qualifiers. Rig Technician, Petroleum Equipment Service Technician, and Water Well Driller are examples of these trades.

31. Starting December 1st, 2019, British Columbia will no longer offer technical training for the Rig Technician apprenticeship program. The apprentices continuing in this trade were taking their technical training in Alberta; however, Alberta no longer offers technical training for this trade and is in the process of de-designating this apprenticeship. Individuals can still receive a designation in the trade by challenging the exam in British Columbia.

32. In 2020, as a result of the pandemic, some provinces cancelled or postponed in-class training, exams and apprenticeships. Counts for various indicators might therefore be considered historical lows. This created a larger deviation in the data for RAIS 2020 registrations, certifications and discontinuations.

Federal Patents, Licences and Royalties Survey 2021-2022

Information for respondents

This information is collected under the authority of the Statistics Act, Revised Statutes of Canada, 1985, Chapter S-19.

Completion of this questionnaire is a legal requirement under this act.

Survey Objective

This survey collects information that is necessary for monitoring federal patent, royalty and licensing related activities in Canada, and to support the development of science and technology policy. The data collected will be used by federal science policy analysts. Your information may also be used by Statistics Canada for other statistical and research purposes.

Confidentiality

Your answers are confidential. Statistics Canada is prohibited by law from releasing any information it collects which could identify any person, business, or organization, unless consent has been given by the respondent or as permitted by the Statistics Act. Statistics Canada will use the information from this survey for statistical purposes.

Security of emails and faxes

Statistics Canada advises you that there could be a risk of disclosure during the transmission of information by facsimile or e-mail. However, upon receipt, Statistics Canada will provide the guaranteed level of protection afforded all information collected under the authority of the Statistics Act.

Data sharing agreement

To reduce response burden and to ensure more uniform statistics, Statistics Canada has entered into an agreement under Section 12 of the Statistics Act with Innovation, Science and Economic Development Canada (ISED) and National Research Council Canada (NRC) for sharing information from this survey. ISED and NRC have agreed to keep the information confidential and use it only for statistical purposes.

Under Section 12, you may refuse to share your information with ISED and NRC by writing a letter of objection to the Chief Statistician, specifying the organizations with which you do not want Statistics Canada to share your data and returning it with the completed questionnaire.

Record linkages

To enhance the data from this survey and to minimize the reporting burden, Statistics Canada may combine it with information from other surveys or from administrative sources.

I hereby authorize Statistics Canada to disclose any or all portions of the data supplied on this questionnaire that could identify this department.

  • Yes
  • No
  • Name of person authorized to sign
  • Official Position
  • Program
  • Department or agency
  • Email address
  • Telephone number
  • Extension

Section 1 - Identifying Intellectual Property (IP)

1.1 Reports and disclosures

Please indicate the number of new instances of Intellectual Property reported or disclosed during the reference year 2021/2022.

Please indicate how many instances of Intellectual Property (not necessarily new) resulted in protection activity by this organization and how many were declined for protection by this organization.

The types of Intellectual Property are defined in the Respondent Guide, Section 4.1.1.

In this question, the number of new IP reports and disclosures and the number of IP reports and disclosures (resulting in protection activity and/or declined for protection) are requested for the following categories:

  • Inventions
  • Copyrightable IP (computer software, databases, educational material, other material)
  • Industrial designs
  • Trademarks
  • Integrated circuit topographies
  • New plant varieties
  • Know-how
  • Other (please specify):

Section 2 - Protecting Intellectual Property (IP)

2.1 Patents

2.1 a) During reference year 2021/2022, how many initiating and follow-on patents were applied for and how many patents were issued with the support of this organization? Initiating patent applications include provisional or first filings.

Follow-on patent applications include any that claim priority from an initiating patent application.

International (for example, Patent Cooperation Treaty applications, PCT) and regional applications (e.g., European Patent Office applications) should be counted as single applications.

In this question, the number of New patent applications (Initiating, Follow-on, and Total) and Total patents issued are requested.

2.1 b) Patents held, commercialized and pending

In this question, the total number is requested for each of the following categories:

  • Total patents held (including patents issued during the reference year)
  • Total patents pending
  • Patents (held or pending) licensed, assigned or otherwise commercialized during the reference year

Section 3 - Licences

3.1 New and active licences

Please report the number of new licences executed during the reference year 2021/2022 and the number of active licences at the end of the reference year 2021/2022. If detailed figures are not available, please report totals in the appropriate cells. Please see the Respondent Guide, Section 4.3.1, for detailed definitions.

In this question, the number of exclusive or sole licences, the number of non-exclusive or multiple licences, and the total are requested for each of the following categories:

  a. New licences executed with Canadian licensees
  b. New licences executed with foreign licensees
    Total new licences (a + b)
  c. Active licences executed with Canadian licensees
  d. Active licences executed with foreign licensees
    Total active licences (c + d)
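
As a rough illustration of how the total rows and the total column relate, the Python sketch below sums hypothetical licence counts for rows a and b. The dictionary layout, the row labels and the numbers are assumptions made for this example; they do not come from the questionnaire.

```python
# Hypothetical counts per row: (exclusive or sole, non-exclusive or multiple).
new_licences = {
    "a_canadian": (4, 10),
    "b_foreign": (1, 3),
}


def column_totals(rows):
    """Sum each licence type across rows, then add the overall total."""
    exclusive = sum(excl for excl, _ in rows.values())
    non_exclusive = sum(non for _, non in rows.values())
    return exclusive, non_exclusive, exclusive + non_exclusive


# Corresponds to the "Total new licences (a + b)" row.
total_new = column_totals(new_licences)
print(total_new)
```

The same function could be applied to rows c and d to obtain the total for active licences.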

3.2 Income received from IP

Please specify the nature of the income received during the reference year 2021/2022 from IP commercialization.

In this question, income received from IP commercialization (in thousands of Canadian dollars) is requested for the following:

  • Running royalties and milestone payments
  • One-time sale of IP (in exchange for a single payment or several payments)
  • Reimbursement of patent, legal and related costs
  • Licence income received from another Canadian institution under a revenue sharing agreement
  • Other (please specify):
  • Other (please specify):
  • Total income received from IP commercialization

Section 4 - Respondent Guide

This questionnaire, in general, covers the intellectual property generated from R&D activities. We acknowledge that commercializable IP arises from other activities as well and that it may be difficult to differentiate. Whenever possible, please report figures for IP generated from R&D activities. If this is not possible, please note that the figures include IP generated from non-R&D activities.

If exact numbers are not readily available, please provide estimates with a note indicating this.

Please do not leave any question blank. Enter zero responses with the digit "0" if the value is known to be zero. If the data are not available, enter "N/A". In cases where the question is not applicable, please indicate this.

Report all dollar amounts in thousands of Canadian dollars.

Notes on survey questions

1.1 Identifying IP – Reports and disclosures:

  • Invention: Includes any new and useful art, process, machine, manufacture or composition of matter, or any new and useful improvement in any art, process, machine, manufacture or composition of matter (Public Servants Inventions Act. R.S., c. P-31, s. 1.). Some inventions are patentable in some jurisdictions but not in others: these include novel genetically-engineered life forms, new microbial life forms, methods of medical treatment and computer software.
  • Copyrightable IP can be broken into the following:
    • Computer software or databases: As noted above, computer software can be patented but normally it is protected by copyright. Databases may also be copyrighted.
    • Educational materials: This category includes special materials that may be copyrighted but are not necessarily in the form of printed books. This could include broadcast lessons, Internet pages, booklets, posters or computer files, among others.
    • Other material: This category includes any copyrightable works other than computer software, databases and special educational materials, such as literary, artistic, dramatic or musical works, books, and papers.
  • Industrial designs: These are original shapes, patterns or ornamentations applied to a manufactured article. Industrial designs are protected by registration with the Canadian Intellectual Property Office.
  • Trademarks: These are words, symbols, designs, or combinations thereof used to distinguish your wares or services from someone else's. Trademarks are registered with the Canadian Intellectual Property Office.
  • Integrated circuit topographies: This is a three-dimensional configuration of the electronic circuits used in microchips and semiconductor chips. Integrated circuit topographies can be protected by registration with the Canadian Intellectual Property Office.
  • New plant varieties: Certain plant varieties that are new, different, uniform and stable may be protected by registration with the Plant Breeders' Rights Office, Canadian Food Inspection Agency.
  • Know-how: This is practical knowledge, technique or expertise. For example, certain information is codified in the patent application but a researcher's know-how could be valuable for commercial optimization of the product. Know-how can be licensed independently of the terms of a related patent.

2.1 Patents:

  • Initiating patent applications include provisional or first filings.
  • Follow-on patent applications include any that claim priority from an initiating patent application.
  • Patents pending: A label sometimes affixed to new products informing others that the inventor has applied for a patent and that legal protection from infringement (including retroactive rights) may be forthcoming.

3.1 New and active licences:

  • "New licences executed" refers to the completion of an agreement with a client to use the institution's intellectual property for a fee or other consideration (such as equity in the company).
  • "Exclusive or Sole licences" refers to agreements allowing only one client the right to use the intellectual property.
  • "Exclusive licence" refers to one granted that is exclusive for a territory, for a field of use worldwide or otherwise. Hence, there may be multiple exclusive licences for a single patent.

3.2 Income received is in thousands of Canadian dollars:

  • Running royalties are those based on the sale of products.
  • Milestone payments are those made by a licensee at predetermined points in the commercialization process.
  • One-time sales of IP include income from assignments to commercial exploiters.
  • Other income received from IP: For example, if a potential licensee contributes the funds to apply for the patent, this could be considered another source of income. Please list all items whether or not figures are available.

Contact Person

Name of the contact person who completed this questionnaire:

  • First name
  • Last name
  • Title
  • Email address
  • Telephone number
  • Extension
  • Fax number

How long did you spend collecting the data and completing the questionnaire?

  • hour(s)
  • minutes

Comments

We invite your comments below.

If necessary, please attach a separate sheet.

Please be assured that we review all comments with the intent of improving the survey.

Thank you for completing this questionnaire.

Survey on Sexual Misconduct in the Canadian Armed Forces

Date: September 2022

Program manager: Director, Centre for Social Data Integration and Development; Director General, Social Data Insights, Integration and Innovation Branch

Reference to Personal Information Bank (PIB):

Personal information collected through the Survey on Sexual Misconduct in the Canadian Armed Forces is described in Statistics Canada's "Special Surveys" Personal Information Bank. The Personal Information Bank refers to information collected through Statistics Canada's ad hoc surveys which are conducted on behalf of other government departments, under the authority of the Statistics Act. "Special surveys" covers a variety of socio-economic topics including health, housing, labour market, education and literacy, as well as demographic data.

The "Special Surveys" Personal Information Bank (Bank number: StatCan PPU 026) is published on the Statistics Canada website under the latest Info Source chapter.

Description of statistical activity:

Statistics Canada will be conducting the Survey on Sexual Misconduct in the Canadian Armed Forces, on a cost-recovery basis on behalf of the Department of National Defence. The survey will provide insight on sexual assault, sexualized and discriminatory behaviours, and knowledge and perceptions of policies and responses to sexual misconduct. This will be the third collection cycle for the Department of National Defence on this topic; the survey is collected every two years, with the previous two cycles being 2016 and 2018 (the 2020 collection was postponed due to COVID-19).

The survey content includes questions on witnessing and experiencing inappropriate sexual behaviours, discrimination based on sex, sexual orientation, or gender identity, and incidents of sexual assault. It also includes questions about the characteristics of sexual misconduct behaviours and incidents, their impact and the reporting of these experiences. Additionally, it contains questions on the age, sex at birth, gender identity, visible minority, Indigenous status, and disability of the respondent. The survey includes specific questions about military members and reservists and their rank over the past 12 months leading up to collection.

This data will be collected from all Regular Force members (approximately 56,000 members, with some exclusions) and members of the Primary Reserve (approximately 27,000) using an employee list provided by the Department of National Defence. This survey is conducted under the authority of the Statistics Act and the response rate is expected to be 30%. Although this collection is being performed for the Department of National Defence, there is no data sharing agreement nor any intent or plan to share any microdata from this survey with them; only aggregate results will be reported. As with previous cycles, SSMCAF 2022 is requesting an exemption from the Directive on Informing Survey Respondents (ISR) to remove the general statement related to data linkage.

Reason for supplement:

While the Generic Privacy Impact Assessment (PIA) addresses most of the privacy and security risks related to statistical activities conducted by Statistics Canada, and applied to the two previous cycles of the survey (2016 and 2018), this supplement describes the measures (see below, Mitigation Factors) being implemented for collection and access to the information for this cycle. These measures reflect the sensitivity of the questions asked and the public scrutiny surrounding sexual misconduct in the Canadian Armed Forces following the release, in May 2022, of the Independent External Comprehensive Review on the Department of National Defence and the Canadian Armed Forces, which highlighted deficiencies around the management of sexual misconduct. This supplement also presents an analysis of the necessity and proportionality of this new collection of personal information.

Necessity and Proportionality

The collection and use of personal information for the Survey on Sexual Misconduct in the Canadian Armed Forces can be justified against Statistics Canada's Necessity and Proportionality Framework:

  1. Necessity:

    The Survey on Sexual Misconduct in the Canadian Armed Forces will support the Department of National Defence's continued efforts to address and prevent sexual misconduct in its workplace and amongst its workforce. The content of the survey, including the personal information being requested, was deemed necessary for understanding, and, ultimately, preventing and addressing experiences of inappropriate sexual behaviours. Research suggests the risk of experiencing sexual harassment and victimization varies according to a number of factors, many of which require the collection of personal information, such as age. Gathering non-identifiable data would not enable the identification of these risk factors and would result in potentially ineffective interventions.

    Research on sexual misconduct has identified certain risk factors such as gender, education, income, visible minority status, disability status and marital status. The data will be analyzed according to these factors to determine if they are also associated with an increased risk of sexual harassment and victimization in the workplace specifically.

    This work has become even more necessary in light of the publication of the Independent External Comprehensive Review on the Department of National Defence and the Canadian Armed Forces, released in May 2022, which highlighted deficiencies around the management of sexual misconduct. Notably, this report also highlighted privacy concerns around the Department of National Defence's own sexual misconduct tracking and analysis system, further justifying the need for Statistics Canada, Canada's foremost statistical expert, to collect and analyze data independently.

  2. Effectiveness - Working assumptions:

    Conducting surveys is the only way to obtain estimates of both reported and unreported sexual misconduct. This is required in order to fully understand the scope of sexual misconduct in the workplace and to put in place preventative measures. This high quality, timely and relevant data will help inform workplace codes of conduct, as well as other policies, laws and programs designed to prevent and respond to sexual misconduct in the workplace. The survey is a census of individuals working for the Canadian Armed Forces. The expected benefit of the project will be proportional to the quality of the data.

    Other surveys of a similar nature have been carried out by Statistics Canada, such as:

    • Survey of Sexual Misconduct at Work (SSMW) (PIA);
    • Survey of Safety in Public and Private Spaces (SSPPS);
    • Survey of Individual Safety in the Postsecondary Student Population (SISPSP) (PIA);
    • General Social Survey (GSS) on Victimization, 1999, 2004, 2014, 2019; and,
    • General Social Survey (GSS) on Canadians at Work and Home.
    These surveys provide valuable insights and are also used to study the prevalence of sexual harassment over time.
  3. Proportionality:
    Proportionality has been considered based on the following elements – sensitivity and ethics:
    • Sensitivity: The Survey on Sexual Misconduct in the Canadian Armed Forces is a voluntary survey, and the collection method is similar to other voluntary household surveys. Because this information is submitted voluntarily, the risk related to the sensitivity of this data collection method is considered low. However, the questions in this survey are of a more sensitive nature. As such, additional mitigation factors (see below) are being implemented to ensure that the collection methods are proportional to the needs for the data.
    • Ethics: The Survey on Sexual Misconduct in the Canadian Armed Forces has been developed using past, similar surveys as precedents to determining best practices, in particular to assist victims in accessing support and to reduce response burden. Additional steps are being taken to reduce burden and assist the Survey on Sexual Misconduct in the Canadian Armed Forces respondents (see below, Mitigation Factors).

    Data collected through the Survey on Sexual Misconduct in the Canadian Armed Forces will contain only the variables required to achieve the statistical goals of the survey. The public benefits of the survey findings, which are expected to inform policies, programs and support services aimed at improving workplace culture and work-related settings, are believed to be proportional to the potential privacy intrusion for this voluntary survey. The results will be used to inform policies and training to promote culture change and future support services for those affected by sexual misconduct.

  4. Alternatives:
    Few sources have gathered data on self-reported sexual victimization in the workplace. In 2016, the General Social Survey provided some insight on sexual harassment in a survey focused on the larger topic of Canadians at work and home. In 2017, Insights West, a market research firm surveyed women exclusively on whether and how often they experience sexual harassment at work. That same year, Employment and Social Development Canada surveyed 1,000 people and held public consultations to better understand the types of harassment behaviours that take place in Canadian workplaces. However, no other quality sources report comprehensive and in-depth information such as the characteristics, impact and reporting of these incidents or the industries and settings in which they occur. Furthermore, existing crime data available from administrative data sources are limited to officially reported events that meet the threshold for criminality and are known to significantly underrepresent true rates of sexual victimization in the population. As such, data gaps exist and more information is needed in order to help guide policies, laws, programs and support services that prevent and respond to these behaviours in the workplace. Additionally, considering the potential bias in the Department of National Defence's own reporting and analysis system, no viable data alternatives exist that could provide such information on the Canadian Armed Forces population specifically. Finally, despite previous cycles of the Survey on Sexual Misconduct in the Canadian Armed Forces providing similar insight, the issue persists, necessitating this regular data collection; it will also provide more up to date information than the previous 2018 cycle, as regular collections allow for time-series analyses which may provide even greater insight in the form of trends and comparisons.

Mitigation factors:

This content has undergone in-person testing, including a voluntary round of sensitivity testing to identify and address potential sources of harm for future respondents. As expected, some questions were considered sensitive by the test respondents but the overall risk of harm to survey participants was deemed manageable through the mitigating actions outlined here.

Consent

All respondents will be informed that their participation is voluntary before being asked any questions.

Access to personal information

Statistics Canada has established that answers collected from survey respondents will not be disclosed to the Department of National Defence or Canadian Armed Forces members. As with previous cycles, the master files for analysis will be placed in Research Data Centres (where all data sets have been stripped of personal details such as names, addresses and phone numbers that could be used to identify particular individuals), with additional clear restrictions preventing employees of the Department of National Defence or members of the Canadian Armed Forces from accessing them. Furthermore, all results from analysis conducted at Research Data Centres are vetted by Statistics Canada, thus ensuring confidentiality of the survey respondents from their employer.

Support Services

Since survey questions may evoke emotional reactions from the respondents, contact information for support services and resources for victims of sexual violence will be made available to respondents in various forms, including in material communicated in their workplace, material included on the survey questionnaire and on the Statistics Canada website.

Feedback

At the end of the survey questionnaire, we have included an open question to understand the experience and impact that the survey had on respondents. We hope to be able to draw the same conclusions that other surveys on the topic have made: that although this topic is a difficult one, respondents appreciate being heard, feel valued and believe there are benefits to the survey.

Conclusion:

This assessment concludes that, with the existing Statistics Canada safeguards, any remaining risks are such that Statistics Canada is prepared to accept and manage the risk.

Instruction in the Minority Official Language – 2021 Census promotional material

Help spread the word about 2021 Census data on instruction in an official language minority in Canada. These data were released on November 30, 2022.

Quick facts

  • The 2021 Census of Population provides new data on the children eligible for instruction in the minority official language at the primary and secondary levels, based on the three criteria established by the Canadian Charter of Rights and Freedoms.
  • In 2021, 897,000 children were eligible for instruction in the minority official language at the primary and secondary levels, namely in English in Quebec (304,000) and in French in Canada outside Quebec (593,000).
  • Among the provinces and territories in Canada outside Quebec, Ontario (350,000), Alberta (67,000), British Columbia (56,000), New Brunswick (49,000) and Manitoba (30,000) had the highest population of children eligible for instruction in French.
  • Among the provinces and territories, New Brunswick (36.0%), Quebec (18.1%), Yukon (14.1%) and Ontario (12.6%) had the largest proportions of children eligible for instruction in the minority official language. About 1 in 10 children (10.5%) were eligible for instruction in French in Canada outside Quebec.
  • Across Canada, over 90% of eligible children were living within 15 kilometres of a minority official language school.
  • In Canada outside Quebec, 292,000 school-aged children attended a regular French program at a primary or secondary French-language school in Canada, representing 64.7% of eligible children aged 5 to 17. This proportion was higher in New Brunswick (80.6%) and Yukon (71.0%), but lower in British Columbia (55.7%), Newfoundland and Labrador (54.2%) and Alberta (49.6%). In Quebec, 175,000 school-aged children attended an English primary or secondary school in Canada, representing 76.2% of eligible children aged 5 to 17 in this province.
  • The new data on language of instruction show that, among persons in Canada outside Quebec aged 5 years and older, almost 1.2 million studied in a regular French program in a French-language school, 1.6 million in a French immersion program, and 137,000 in both types of programs.
  • Nearly 1 million people aged 5 and older living in Quebec at the time of the 2021 Census studied at an English primary or secondary school in Canada.

Resources

Social media content

Statistics Canada encourages our community supporters to share our content and images to their own social media accounts. You can save the images to your device and copy and paste the text content to your social media platforms.

Post 1

Almost one in eight children in Canada was admissible for instruction in the official minority language in 2021. Check out the new #2021Census data on the topic here:

bit.ly/3VR0byb

Download image for post 1

Post 2

New data from the #2021Census reveal an updated portrait on instruction in the official minority language in Canada, not only for children, but also for adults.

For more info:

bit.ly/3VR0byb

Download image for post 2

Web images

Official Language Tile (JPG, 103 KB)

Terms of use

See the terms of use for information on the approved use of official wordmarks, identifiers and content.

Labour and Language of Work – 2021 Census promotional material

Help spread the word about 2021 Census data on labour and language of work in Canada. These data were released on November 30, 2022.

Quick facts

  • In the face of population aging and the COVID-19 pandemic, the number of health care workers increases by over 200,000 in five years to 1.5 million in 2021.
  • The construction industry, with over 1.3 million workers, continues to be an important employer for men, who work mostly as labourers and in skilled trades.
  • Growth in professional, scientific and technical services employment outpaces that of all other industries, with 1.5 million employed in 2021.
  • Four million Canadians are working in sales and service occupations.
  • The participation rate fell from 65.2% in 2016 to 63.7% in 2021 as more baby boomers near or enter retirement age.
  • From 2016 to 2021, a record 1.3 million new immigrants came to Canada seeking opportunities, boosting labour market growth.
  • Recent immigrants in 2021 experienced lower unemployment rates than earlier cohorts.
  • Participation rates increased from 2016 to 2021 for many racialized groups, with notable increases for Korean and West Asian Canadians.
  • Participation rates declined for First Nations people and Inuit as their labour force growth lags behind their population increases.
  • In Canada's biggest cities, employment rates in 2021 are highest among those in Quebec and the Prairies.
  • The information and communication technology sector is a key employer in six Canadian high-tech hubs, and employed more than 600,000 workers nationally in 2021.
  • In May 2021, there were 4.2 million people working at home, up from 1.3 million in 2016.
  • Working at home is most prominent in big cities and among people in professional occupations—with over 5% of teleworkers relocating from where they lived 12 months earlier.
  • Despite a record-high number and share of Canadians speaking a non-official language at home, English and French remained the languages of convergence in workplaces across the country as 98.7% of workers used one of these two languages most often at work. Overall, 77.1% of workers mainly used English at work, 19.9% mainly used French, and 1.7% used English and French equally.

Resources

Social media content

Statistics Canada encourages our community supporters to post our content and images to their own social media accounts. You can save the images to your device and copy and paste the text content to your social media platforms to share.

Post 1

#DYK? Healthcare and social assistance; construction; and professional, scientific and technical services accounted for nearly one third of all employment in Canada in 2021.

To learn more, check out our new #2021Census data:

bit.ly/3gJqpDK

Download image for post 1

Post 2

In 2021, immigrants made up over one-quarter of Canada's core-aged labour force.

For more info from the #2021Census :

bit.ly/3gJqpDK

Download image for post 2

Post 3

While more Canadians than ever speak a non-official language at home, 77.1% of workers mainly used English at work, 19.9% mainly used French, and 1.7% used both equally.

Learn more from the #2021Census data:

bit.ly/3gJqpDK

Download image for post 3

Web Images

Labour Tile (JPG, 111 KB)

Terms of use

See the terms of use for information on the approved use of official wordmarks, identifiers and content.

Privacy preserving technologies, part three: Private statistical analysis and private text classification based on homomorphic encryption

By: Benjamin Santos and Zachary Zanussi, Statistics Canada

Introduction

What's possible in the realm of the encrypted and what use cases can be captured with homomorphic encryption? The Data Science Network's first article in the privacy preserving series, A Brief Survey of Privacy Preserving Technologies, introduces privacy enhancing technologies (PETs; footnote 1) and how they enable analytics while protecting data privacy. The second article in the series, Privacy Preserving Technologies Part Two: Introduction to Homomorphic Encryption, took a deeper look at one of the PETs, more specifically homomorphic encryption (HE). In this article, we describe applications explored by data scientists at Statistics Canada in encrypted computation.

HE is an encryption technique that allows computation on encrypted data, which enables several paradigms for secure computing. These include secure outsourced computing, where a data holder allows a third party (for example, the cloud) to compute on sensitive data while ensuring that the input data is protected. Indeed, if the data holder wants the cloud to compute some (polynomial) function f on their data v, they can encrypt it into a ciphertext, denoted [v], and send it safely to the cloud, which computes f homomorphically to obtain [f(v)] and forwards the result back to the data holder, who can decrypt and view f(v). The cloud has no access to the input, output, or any intermediate data values.

Figure 1: Illustration of a typical HE workflow

An illustration of a typical HE workflow. The data, v, is encrypted, putting it in a locked box [v]. This value is sent to the compute party (the cloud). Gears turn and the input encryption [v] is transformed into the output encryption, [f(v)], as desired. This result is forwarded back to the owner who can take it out of the locked box and view it. The cloud doesn't have access to input, output, or intermediate values.

HE is currently being considered by international groups for standardization. The Government of Canada does not recommend HE or the use of any cryptographic technique before it's standardized. While HE is not yet ready for use on sensitive data, this is a good time to explore its capabilities and potential use cases.

Scanner data

Statistics Canada collects real time data from major retailers for a variety of data products. This data describes the daily transactions performed such as a description of the product sold, the transaction price, and metadata about the retailer. This data is called "scanner data", after the price scanners used to ring a customer through checkout. One use of scanner data is to increase the accuracy of the Consumer Price Index, which measures inflation and the strength of the Canadian dollar. This valuable data source is treated as sensitive data—we respect the privacy of the data and the retailers that provide it.

The first step in processing this data is to classify the product descriptions into an internationally standardized system of product codes known as the North American Product Classification System (NAPCS) Canada 2017 Version 1.0. This hierarchical system of seven-digit codes is used to classify different types of products for analysis. For example, one code may correspond to coffee and related products. Each entry in the scanner data needs to be assigned one of these codes based on the product description given by the retailer. These descriptions, however, are not standardized and may differ widely between different retailers or across different brands of similar products. Thus, the desired task is to convert these product descriptions, which often include abbreviations and acronyms, into their codes.

After it has been classified, the data is grouped by NAPCS code and statistics are computed on these groups. This allows us to gain a sense of how much is spent on each type of product across the country, and how this value changes over time.

Figure 2: High level overview of the scanner data workflow with sample data

High level overview of the scanner data workflow. First, the product descriptions are classified into NAPCS codes. Examples are given: "mochi ice cream bon bons" is assigned NAPCS code 5611121, while "chipotle barbeque sauce" is assigned 5611132. Application 2 is to assign these codes to the descriptions. The product descriptions have a few identifiers and a price value attached. Application 1 is to sort the data by these codes and identifiers, and compute statistics on the price values.

Sample dataset 1
Description | ID1 | ID2 | Value
"mochi ice cream bon bons" | 054 | 78 | $5.31
"chipotle barbeque sauce" | 201 | 34 | $3.80

Application 2

Sample dataset 2
NAPCS | ID1 | ID2 | Value
5611121 | 054 | 78 | $5.31
5611132 | 201 | 34 | $3.80

Application 1

Statistics (total, mean, variance)

Given the data's sensitivity and importance, we've targeted it as a potential area where PETs can preserve our data workflow while maintaining the high level of security required. The two tasks above have, up to now, been performed within Statistics Canada's secure infrastructure, where we can be sure the data is safe at the time of ingestion and throughout its use. In 2019, when we were first investigating PETs within the agency, we decided to experiment using the cloud as a third-party compute resource, secured by HE.

We model the cloud as a semi-honest party, meaning it will follow the protocol we assign it, but it will try to infer whatever it can about the data during the process. This means we need sensitive data to always be encrypted or obscured. As a proof-of-concept, we replaced the scanner data with a synthetic data source, which allows us to conduct experiments without putting the security of the data at risk.

Application 1: Private statistical analysis

Our first task was to perform the latter part of the scanner data workflow – the statistical analysis. We constructed a synthetic version of the scanner data to ensure its privacy. This mock scanner data consisted of thirteen million records, each consisting of a NAPCS code, a transaction price, and some identifiers. This represents about a week's worth of scanner data from a single retailer. The task was to sort the data into lists, encrypt it, forward it to the cloud, and instruct the cloud to compute the statistics. The cloud would then forward us the still-encrypted results, so we could decrypt and use them for further analysis.
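As a rough sketch of the "sort the data into lists" step, the snippet below groups price values by NAPCS code and identifiers before any encryption happens. The record layout follows Figure 2, but the tuple format, the third record, and the function name are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record layout: (NAPCS code, ID1, ID2, price).
Record = Tuple[str, str, str, float]
GroupKey = Tuple[str, str, str]

def group_prices(records: List[Record]) -> Dict[GroupKey, List[float]]:
    """Collect the price values of all records that share a code and identifiers."""
    groups: Dict[GroupKey, List[float]] = defaultdict(list)
    for napcs, id1, id2, price in records:
        groups[(napcs, id1, id2)].append(price)
    return dict(groups)

records = [
    ("5611121", "054", "78", 5.31),
    ("5611132", "201", "34", 3.80),
    ("5611121", "054", "78", 4.99),  # made-up extra record for the same group
]
groups = group_prices(records)
print(groups)
```

Each list in `groups` is what would then be encrypted and sent to the cloud.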

Suppose our dataset is sorted into lists of the form \(v = (v_1, \ldots, v_l)\). It's relatively straightforward to encrypt each value \(v_i\) into a ciphertext \([v_i]\) and send the list of ciphertexts \(([v_1], \ldots, [v_l])\) to the cloud. The cloud can use homomorphic addition and multiplication to calculate the total, mean, and variance and return these as ciphertexts to us (we'll see how division is handled for the mean and variance later in this article). We do this for every list, and decrypt and view our data. Simple, right?

The problem with a naïve implementation of this protocol is data expansion. A single CKKS (footnote 2) ciphertext is a pair of polynomials of degree \(2^{14}\) with 240-bit coefficients. Altogether, it may take 1 MB to store a single record. Over the entire dataset of thirteen million records, this becomes 13 TB of data! The solution to this problem is called packing.
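The storage estimate can be checked with some quick arithmetic. The sketch below reads the polynomial degree as 2^14 and assumes each coefficient is stored in exactly 240 bits with no other overhead, which is a simplification.

```python
# Back-of-the-envelope size of one CKKS ciphertext, using the figures quoted above.
POLY_DEGREE = 2 ** 14          # coefficients per polynomial (assumed reading of the degree)
COEFF_BITS = 240               # bits per coefficient
POLYS_PER_CIPHERTEXT = 2       # a ciphertext is a pair of polynomials
RECORDS = 13_000_000           # records in the synthetic dataset

ciphertext_bytes = POLY_DEGREE * COEFF_BITS * POLYS_PER_CIPHERTEXT // 8
naive_total_bytes = ciphertext_bytes * RECORDS

print(f"one ciphertext: {ciphertext_bytes / 1e6:.2f} MB")
print(f"one ciphertext per record: {naive_total_bytes / 1e12:.1f} TB")
```

This gives just under 1 MB per ciphertext and a little under 13 TB for the whole dataset, in line with the rounded figures above.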

Packing

Ciphertexts are big, and we have many small pieces of data. We can use packing to store an entire list of values in a single ciphertext, and the CKKS scheme allows us to perform Single Instruction Multiple Data (SIMD) type operations on that ciphertext, so we can compute several statistics at once! This ends up being a massive increase in efficiency for many HE tasks, and a clever data packing structure can make the difference between an intractable problem and a practical solution.

Suppose we have a list of \(l\) values, \(v = (v_1, v_2, \ldots, v_l)\). Using CKKS packing, we can pack this entire list into a single ciphertext, denoted by \([v]\). Now, the operations of homomorphic addition and multiplication occur slot-wise in a SIMD fashion. That is, if \(u = (u_1, u_2, \ldots, u_l)\) encrypts to \([u]\), then we can compute homomorphic addition to get

\[ [u] \oplus [v] = [u + v] \]

where \([u + v]\) is an encryption (footnote 3) of the list \((u_1 + v_1, u_2 + v_2, \ldots, u_l + v_l)\). This homomorphic addition takes as much time to compute as if there were only one value in each ciphertext, so it's clear we can get an appreciable efficiency boost via packing. The downside is that we now must use this vector structure in all of our calculations, but with a little effort, we can figure out how to vectorize relevant calculations to take advantage of packing.
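To make the slot-wise behaviour concrete, here is a plaintext simulation. No encryption is involved: ordinary Python lists stand in for packed ciphertexts, and the function names are made up for this sketch.

```python
from typing import List

def slotwise_add(u: List[float], v: List[float]) -> List[float]:
    """Plaintext stand-in for homomorphic addition of two packed ciphertexts."""
    if len(u) != len(v):
        raise ValueError("packed vectors must have the same number of slots")
    return [a + b for a, b in zip(u, v)]

def slotwise_mul(u: List[float], v: List[float]) -> List[float]:
    """Plaintext stand-in for homomorphic multiplication of two packed ciphertexts."""
    if len(u) != len(v):
        raise ValueError("packed vectors must have the same number of slots")
    return [a * b for a, b in zip(u, v)]

u = [1.0, 2.0, 3.0, 4.0]
v = [10.0, 20.0, 30.0, 40.0]
packed_sum = slotwise_add(u, v)       # one operation, four additions
packed_product = slotwise_mul(u, v)   # one operation, four multiplications
print(packed_sum, packed_product)
```

In a real HE library each call would be a single ciphertext operation, regardless of how many slots are filled.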

Figure 3: An illustration of packing. The four values can either be encrypted into four separate ciphertexts, or all be packed into one

An illustration of packing. Four values, \(v_1, v_2, v_3, v_4\), need to be encrypted. In one case, they can all be encrypted into separate ciphertexts, depicted as locked boxes. In another, we can pack all four values into a single box. In the former case, it will take four boxes, which is less efficient to store and to work with. The latter case, packing as many values as possible, is almost always preferred.

Now I know what you are thinking - doesn't packing, which stores a bunch of values within a vector, make it impossible to compute across values within a list? That is, if we have \(v = (v_1, v_2, \ldots, v_l)\), what if I wanted \(v_1 + v_2\)? For this, we have access to an operation known as rotation. Rotation takes a ciphertext \([v]\) that is an encryption of \((v_1, v_2, \ldots, v_l)\) and turns it into \(\mathrm{Rot}([v])\), which is an encryption of \((v_2, v_3, \ldots, v_l, v_1)\). That is, it shifts all the values left by one slot, sliding the first value into the last slot. So, by computing \([v] \oplus \mathrm{Rot}([v])\), we get an encryption of

\[ (v_1 + v_2, v_2 + v_3, \ldots, v_l + v_1), \]

and the desired value is in the first slot.
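The same idea in a plaintext simulation, with a list standing in for the packed ciphertext. The helper name `rotate_left` is ours, not a library call.

```python
from typing import List

def rotate_left(slots: List[float], steps: int = 1) -> List[float]:
    """Cyclically shift every slot to the left; the first values wrap to the end."""
    steps %= len(slots)
    return slots[steps:] + slots[:steps]

v = [1.0, 2.0, 3.0, 4.0]
rotated = rotate_left(v)                          # (v2, v3, v4, v1)
pairwise = [a + b for a, b in zip(v, rotated)]    # slot-wise addition
print(rotated, pairwise)
```

The first slot of `pairwise` holds v1 + v2, as described above.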

Mathematically, packing is achieved by exploiting the properties of the cleartext, plaintext and ciphertext spaces. Recall that the encryption and decryption functions are maps between the latter two spaces. Packing requires another step called encoding, which encodes a vector of (potentially complex, though in our case, real) values \(v\) from the cleartext space into a plaintext polynomial \(p\). While the data within \(p\) is not human-readable as-is, it can be decoded into the vector of values by any computer without requiring any keys. The plaintext polynomial \(p\) can then be encrypted into the ciphertext \([v]\) and used to compute statistics on scanner data (footnote 4).

Efficient statistical analysis using packing

Getting back to the statistical analysis on scanner data, remember that the problem was that encrypting every value into its own ciphertext was too expensive. Packing will allow us to vectorize this process, making it orders of magnitude more efficient in terms of communication and computation.

We can now begin to compute the desired statistics on our list \(v = (v_1, v_2, \ldots, v_l)\). The first value of interest is the total, \(T_v = \sum_{i=1}^{l} v_i\), obtained by summing all the values in the list. After encrypting \(v\) into a packed ciphertext \([v]\), we can simply add rotations of the ciphertext \([v]\) to itself until we have a slot with the sum of all the values. In fact, we can do better than this naïve strategy of \(l\) rotations and additions: we can do it in \(\log_2 l\) steps by rotating by one slot first, then by two slots, then four, then eight, and so on until we get the total \(T_v\) in a slot.
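The rotate-and-add doubling trick can be sketched on plain lists as follows. This simulation assumes the number of slots is a power of two and that rotation is cyclic over exactly those slots. A shorter list would need to be padded with zeros first.

```python
from typing import List, Tuple

def rotate_left(slots: List[float], steps: int) -> List[float]:
    """Cyclic left rotation, standing in for a homomorphic rotation."""
    steps %= len(slots)
    return slots[steps:] + slots[:steps]

def total_in_every_slot(slots: List[float]) -> Tuple[List[float], int]:
    """Sum all slots using log2(n) rotate-and-add rounds; returns (result, rounds)."""
    n = len(slots)
    if n == 0 or n & (n - 1):
        raise ValueError("slot count must be a power of two in this sketch")
    acc = list(slots)
    rounds = 0
    step = 1
    while step < n:
        rotated = rotate_left(acc, step)
        acc = [a + b for a, b in zip(acc, rotated)]  # one homomorphic addition
        step *= 2
        rounds += 1
    return acc, rounds

values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
totals, rounds = total_in_every_slot(values)
print(totals, rounds)
```

With eight slots, three rounds are enough, and every slot ends up holding the total rather than just the first one.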

Next, we want the mean, \(M_v = T_v / l\). To do this, we encrypt the value \(1/l\) into the ciphertext \([1/l]\) and send it along with the list \([v]\). We can then simply multiply this value by the ciphertext that we got when computing the total. It's a similar story for the variance, \(V_v = \frac{1}{l} \sum_{i=1}^{l} (v_i - M_v)^2\), where we subtract the mean from \([v]\), multiply the result by itself, compute the total again, and then multiply again by the \([1/l]\) ciphertext.
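Putting the pieces together, the following plaintext simulation computes the mean and variance using only slot-wise additions, slot-wise multiplications, and rotations, with division replaced by multiplication with a precomputed 1/l. It assumes the list exactly fills a power-of-two number of slots. With zero padding, the padded slots would need to be masked out before squaring, which this sketch does not handle.

```python
from typing import List, Tuple

def rotate_left(slots: List[float], steps: int) -> List[float]:
    steps %= len(slots)
    return slots[steps:] + slots[:steps]

def add(u: List[float], v: List[float]) -> List[float]:
    return [a + b for a, b in zip(u, v)]

def mul(u: List[float], v: List[float]) -> List[float]:
    return [a * b for a, b in zip(u, v)]

def total_in_every_slot(slots: List[float]) -> List[float]:
    """Rotate-and-add until every slot holds the sum (power-of-two slot count)."""
    acc, step = list(slots), 1
    while step < len(slots):
        acc = add(acc, rotate_left(acc, step))
        step *= 2
    return acc

def mean_and_variance(values: List[float]) -> Tuple[float, float]:
    l = len(values)
    if l == 0 or l & (l - 1):
        raise ValueError("this sketch assumes a power-of-two list length")
    inv_l = [1.0 / l] * l                                # stands in for the [1/l] ciphertext
    mean = mul(total_in_every_slot(values), inv_l)       # M_v in every slot
    centred = add(values, [-m for m in mean])            # v_i - M_v
    squared = mul(centred, centred)                      # (v_i - M_v)^2
    variance = mul(total_in_every_slot(squared), inv_l)  # V_v in every slot
    return mean[0], variance[0]

mean, variance = mean_and_variance([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, variance)
```

Note that this is the population variance (dividing by l), matching the formula above.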

Let's investigate the savings that packing afforded us. In our case, we had about 13 million data points, which separate into 18,000 lists. Assuming that we could pack every list into a single ciphertext, that reduces the size of the encrypted dataset by almost three orders of magnitude. But in reality, the lists were all different sizes, with some being as large as tens of thousands of entries and others as small as two or three, with the majority falling in the range of hundreds to thousands. Through some clever manipulation, we were able to pack multiple lists into single ciphertexts and run the total, mean, and variance algorithms for them all at once. By using ciphertexts that can pack 8,192 values at once, we were able to reduce the number of ciphertexts to just 2,124. At about 1 MB per ciphertext, this makes the encrypted dataset about two gigabytes (GB). With the cleartext data taking 84 megabytes (MB), this left us with a data expansion factor of about 25 times. Overall, the encrypted computation took around 19 minutes, which is 30 times longer than the unencrypted computation.
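These figures can be sanity-checked with a few lines of arithmetic. The sketch below uses only the rounded numbers quoted in this article and treats a ciphertext as exactly 1 MB, so the results are approximate.

```python
import math

RECORDS = 13_000_000       # data points
LISTS = 18_000             # lists after grouping
SLOTS = 8_192              # values packed per ciphertext
CIPHERTEXTS = 2_124        # ciphertexts actually used
CIPHERTEXT_MB = 1.0        # approximate size of one ciphertext
CLEARTEXT_MB = 84.0        # size of the cleartext data

values_per_list = RECORDS / LISTS                     # average list length
encrypted_mb = CIPHERTEXTS * CIPHERTEXT_MB            # size of the encrypted dataset
expansion = encrypted_mb / CLEARTEXT_MB               # data expansion factor
min_ciphertexts = math.ceil(RECORDS / SLOTS)          # if every slot were filled
slot_utilization = RECORDS / (CIPHERTEXTS * SLOTS)    # fraction of slots in use

print(round(values_per_list), encrypted_mb, round(expansion, 1),
      min_ciphertexts, round(slot_utilization, 2))
```

On these numbers, the average list holds roughly 722 values, the expansion factor comes out at about 25, and the 2,124 ciphertexts are not far above the 1,587 that perfectly full ciphertexts would require.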

Application 2: Private text classification based on homomorphic encryption

Next, we tackled the machine learning training task. Machine learning training is a notoriously expensive task, so it was unclear whether we'd be able to implement a practical solution. Recall the first task in the scanner data workflow - the noisy, retailer-dependent product descriptions need to be classified into the NAPCS codes. This is a multiclass text classification task. We created a synthetic dataset from an online repository of product descriptions and tagged them with one of five NAPCS codes.

Running a neural network is basically multiplying a vector by a series of matrices, and training a neural network involves forward passes, which evaluate training data in the network, as well as backward passes, which use (stochastic) gradient descent and the chain rule to find the best way to update the model parameters to improve performance. All this boils down to multiplying values by other values, and by having access to homomorphic multiplication, training an encrypted network is possible in theory. In practice, this is hampered by a core limitation of the CKKS scheme: the leveled nature of homomorphic multiplications. We'll discuss this element first, and then explore the different protocol aspects designed to mitigate it.

Ciphertext levels in CKKS

In order to protect your data during encryption, the CKKS scheme adds a small amount of noise to each ciphertext. The downside is that this noise accumulates with consecutive operations and needs to be modulated. CKKS has a built-in mechanism for this, but unfortunately it only allows for a bounded number of operations on a single ciphertext.

Suppose we have two freshly encrypted ciphertexts - \([v_1]\) and \([v_2]\). We can homomorphically multiply them to get the ciphertext \([v_1 v_2]\). The problem is that the noise (footnote 5) in this resulting ciphertext is much larger than in the freshly encrypted ones, so if we multiplied it by a freshly encrypted \([v_3]\), the result would be affected by this mismatch.

We would first have to rescale the ciphertext \([v_1 v_2]\). This is transparently handled by the HE library, but under the hood, the ciphertext is moved to a slightly different space. We say that \([v_1 v_2]\) has been moved down a level, meaning the ciphertext started out on level \(L\), and after rescaling, it is left on level \(L-1\). The value \(L\) is determined by the security parameters we chose when we set up the HE scheme.

Now we have \([v_1 v_2]\), which has a normal amount of noise but is on level \(L-1\), and the freshly encrypted \([v_3]\), which is still on level \(L\). Unfortunately, we can't perform operations on ciphertexts that are on different levels, so we first have to reduce the level of \([v_3]\) to \(L-1\) by modulus switching. Now that both ciphertexts are on the same level, we can finally multiply them as desired. We don't need to rescale the result of additions, but we do for every multiplication.

Figure 4: An illustration of levels

An illustration of levels. On the left we can see the level on which each ciphertext resides: from top to bottom, we have levels \(L\), \(L-1\), and \(L-2\). Freshly encrypted \(v_1\), \(v_2\), and \(v_3\) all inhabit level \(L\) on top. After multiplying, \(v_1 v_2\) moves down to level \(L-1\). If we want to multiply \(v_1 v_2\) by \(v_3\), we need to first bring \(v_3\) down to level \(L-1\). The resulting product, \(v_1 v_2 v_3\), lives on level \(L-2\).

This leveled business has two consequences. One, the developer needs to be conscious of the level of the ciphertexts they're using. And two, a ciphertext will eventually reach level 0 after many consecutive multiplications, at which point it's spent, and we can't perform any more multiplications on it.
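The bookkeeping a developer has to keep in mind can be illustrated with a toy model. Nothing here is encrypted and none of these names come from a real HE library. The `ToyCiphertext` class simply carries a value together with its level, and the multiplication is assumed to include the rescaling step.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToyCiphertext:
    """Plaintext stand-in for a CKKS ciphertext: a value plus its current level."""
    value: float
    level: int

def encrypt(value: float, top_level: int) -> ToyCiphertext:
    # Fresh ciphertexts start on the top level L.
    return ToyCiphertext(value, top_level)

def mod_switch(ct: ToyCiphertext, target_level: int) -> ToyCiphertext:
    # Levels can only be lowered, never raised.
    if not 0 <= target_level <= ct.level:
        raise ValueError("target level must be between 0 and the current level")
    return ToyCiphertext(ct.value, target_level)

def multiply(a: ToyCiphertext, b: ToyCiphertext) -> ToyCiphertext:
    # Multiply and rescale: operands must share a level, and the result drops one level.
    if a.level != b.level:
        raise ValueError("operands must be on the same level")
    if a.level == 0:
        raise ValueError("ciphertext is spent: no levels left")
    return ToyCiphertext(a.value * b.value, a.level - 1)

TOP_LEVEL = 2  # the value L in the text, chosen small for illustration
v1, v2, v3 = (encrypt(x, TOP_LEVEL) for x in (2.0, 3.0, 4.0))
v1v2 = multiply(v1, v2)                  # now on level L-1
v3_down = mod_switch(v3, v1v2.level)     # bring v3 down to level L-1
product = multiply(v1v2, v3_down)        # now on level L-2, which is 0 here
print(product)
```

With L = 2, the product of three values already lands on level 0, and any further call to `multiply` on it raises an error.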

There are a few options for extending computations beyond the number of levels available. The first is a process called bootstrapping, where the ciphertext is homomorphically decrypted and re-encrypted, resulting in a fresh ciphertext. This process can theoretically allow an unbounded number of multiplications. However, bootstrapping adds its own cost to the computation. Alternatively, one can refresh the ciphertexts by returning them to the secret key holder, who can decrypt and re-encrypt them before returning them to the cloud. Sending ciphertexts back and forth adds a communication cost, but this is sometimes worth it when there aren't many ciphertexts to send.

Impact of levels on our network structure

We had to consider this fundamental constraint on HE when designing our neural network. The process of training a network involves performing a prediction, evaluating the prediction, and updating the model parameters. This means that every round, or epoch, of training consumes multiplicative levels. We tried to minimize the number of multiplications needed to traverse forward and backward through the network to maximize the number of training rounds available. We'll now describe the network structure and the data encoding strategy.

The network architecture was inspired by the existing solution in production, which amounted to an ensemble model of linear learners. We trained several single-layer networks separately, and at prediction time, we had each learner vote on each entry. We chose this approach because it reduced the amount of work required to train each model: less training time meant fewer multiplications.

Each layer in a neural network is a weight matrix of parameters multiplied by data vectors during the forward pass. We can adapt this to HE by encrypting each input vector into a single ciphertext and encrypting each row of the weight matrix into another ciphertext. The forward pass then becomes several vector multiplications, followed by logarithmically many rotations and additions to compute the sum of the outputs (recall that matrix multiplication is a series of dot products, each of which is a component-wise multiplication followed by a sum of the resulting values).
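To make the dot-product step concrete, here is a minimal plaintext sketch of the rotate-and-add pattern commonly used to sum the slots of a packed vector. Plain Python lists stand in for ciphertext slots, and the helper names (`rotate`, `slotwise_mul`, `slotwise_add`, `packed_dot`) are hypothetical; a real HE library exposes its own rotation and arithmetic calls.

```python
def rotate(slots, k):
    """Cyclically rotate a slot vector left by k positions."""
    k %= len(slots)
    return slots[k:] + slots[:k]


def slotwise_mul(a, b):
    """Component-wise product, standing in for one ciphertext multiplication."""
    return [x * y for x, y in zip(a, b)]


def slotwise_add(a, b):
    """Component-wise sum, standing in for one ciphertext addition."""
    return [x + y for x, y in zip(a, b)]


def packed_dot(weights, inputs):
    """Dot product using one multiplication and log2(n) rotate-and-add steps."""
    n = len(weights)
    if n == 0 or n != len(inputs) or n & (n - 1):
        raise ValueError("vectors must share a non-zero power-of-two length")
    acc = slotwise_mul(weights, inputs)
    shift = n // 2
    rotations = 0
    while shift >= 1:
        # After this step, each slot holds the sum of twice as many products.
        acc = slotwise_add(acc, rotate(acc, shift))
        shift //= 2
        rotations += 1
    return acc[0], rotations


total, rotations = packed_dot([1, 2, 3, 4, 5, 6, 7, 8], [1] * 8)
print(total, rotations)
```

For a vector of eight slots the sum needs three rotations, and only the initial component-wise product consumes a multiplicative level in this sketch.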

Preprocessing is an important part of any text classification task. Our data were short sentences which often contained acronyms or abbreviations. We chose to use a character n-gram encoding with n equal to three, four, five, and six. For example, "ice cream" was broken into the 3-grams {"ice", "cre", "rea", "eam"}. These n-grams were collected and enumerated over the entire dataset and were used to one-hot encode each entry. A hashing vectorizer (footnote 6) was used to reduce the dimension of the encoded entries.
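The encoding can be illustrated with a short standard-library sketch. It makes several assumptions that are not taken from the article: n-grams are extracted within each whitespace-separated word (inferred from the "ice cream" example), text is lower-cased, MD5 is used as the bucket hash, and the function names are hypothetical. The vectorizer actually used is the one cited in the footnote, not this code.

```python
import hashlib


def char_ngrams(text, n):
    """Character n-grams taken within each whitespace-separated word."""
    grams = []
    for word in text.lower().split():
        grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def hashed_vector(text, dim=4096, sizes=(3, 4, 5, 6)):
    """Binary vector marking which hashed n-gram buckets appear in the text."""
    vec = [0] * dim
    for n in sizes:
        for gram in char_ngrams(text, n):
            digest = hashlib.md5(gram.encode("utf-8")).digest()
            bucket = int.from_bytes(digest[:8], "big") % dim
            vec[bucket] = 1
    return vec


print(char_ngrams("ice cream", 3))
print(sum(hashed_vector("ice cream")))
```

Hashing fixes the vector length in advance, at the cost of occasional collisions where two different n-grams land in the same bucket.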

Similarly to how we packed multiple lists together in the statistical analysis, we found we could pack together multiple models and train them at once. Using a value of N = 2^15 meant we could pack 16,384 values into each ciphertext, so if we hashed our data to 4,096 dimensions, we could fit four models into each ciphertext. This meant we could train four models simultaneously, with the added benefit of reducing the number of ciphertexts required to encrypt our dataset by a factor of four.
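The packing arithmetic can be checked with a small sketch. The constants come from the text, and the slot count of N/2 is consistent with the 16,384 figure given there; the `pack` and `unpack` helpers and the zero padding are hypothetical illustrations using plain lists rather than ciphertexts.

```python
RING_DIMENSION = 2 ** 15     # N, as given in the text
SLOTS = RING_DIMENSION // 2  # 16,384 values per ciphertext
HASH_DIM = 4096              # hashed feature dimension
MODELS_PER_CIPHERTEXT = SLOTS // HASH_DIM


def pack(vectors, slots=SLOTS):
    """Concatenate equal-length vectors into one slot vector, zero-padded."""
    width = len(vectors[0])
    if any(len(v) != width for v in vectors):
        raise ValueError("all vectors must have the same length")
    if width * len(vectors) > slots:
        raise ValueError("vectors do not fit in one ciphertext")
    packed = [x for v in vectors for x in v]
    return packed + [0.0] * (slots - len(packed))


def unpack(packed, width, count):
    """Split a packed slot vector back into `count` vectors of `width` values."""
    return [packed[i * width:(i + 1) * width] for i in range(count)]


# Four hypothetical weight vectors, one per submodel.
models = [[float(m)] * HASH_DIM for m in range(MODELS_PER_CIPHERTEXT)]
packed = pack(models)
print(MODELS_PER_CIPHERTEXT, len(packed))
```

Because every slot-wise operation acts on all packed models at once, one ciphertext operation updates four submodels in this layout.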

Our choice of encryption parameters meant we had between 12 and 16 multiplications before we ran out of levels. With a single-layer network, the forward pass and backward pass took two multiplications each, leaving us room for three to four epochs before our model ciphertexts were spent. Our ensemble approach meant we could train several ciphertexts' worth of models, so we could have as many learners as desired at the cost of additional training time. Carefully modulating which models learned on what data helped us maximize the overall performance of the ensemble.
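The epoch budget follows from simple integer arithmetic, sketched below. The function names are hypothetical, and `refreshes_needed` assumes that a refresh restores the full multiplication budget; it illustrates the trade-off only and does not describe how any particular training run was scheduled.

```python
def epochs_available(multiplications, per_forward=2, per_backward=2):
    """Whole training epochs that fit in a given multiplication budget."""
    return multiplications // (per_forward + per_backward)


def refreshes_needed(target_epochs, epochs_per_budget):
    """Refreshes required to reach target_epochs, assuming each refresh
    restores the full budget (an assumption made for illustration)."""
    if epochs_per_budget <= 0:
        raise ValueError("budget must allow at least one epoch")
    full_budgets = -(-target_epochs // epochs_per_budget)  # ceiling division
    return max(full_budgets - 1, 0)


for budget in (12, 16):
    print(budget, epochs_available(budget))
```

With the figures in the text, a budget of 12 multiplications allows three epochs and a budget of 16 allows four.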

Our dataset consisted of 40,000 training examples and 10,000 test examples, each evenly distributed over our five classes. Training four submodels for six epochs took five hours and resulted in a model that obtained 74% accuracy on the test set. Using the ciphertext refreshing tactic previously described, we can hypothetically train for as many epochs as we'd like, though every refresh adds more communication cost to the process (footnote 7). After training, the cloud sends the encrypted model back to StatCan, and we can run it in the cleartext on data in production. Or we can keep the encrypted model on the cloud and run encrypted model inference when we have new data to classify.

Conclusion

This concludes the Statistics Canada series on applications of HE to scanner data explored to date. HE has a number of other applications which might prove interesting to a national statistics organization, such as Private Set Intersection, in which two or more parties jointly compute the intersection of private datasets without sharing them, as well as Privacy Preserving Record Linkage, where parties additionally link, share, and compute on microdata attached to their private datasets.

There's a lot left to explore in the field of PETs, and StatCan is working to leverage this new field to protect the privacy of Canadians while still delivering quality information that matters.

Meet the Data Scientist

Register for the Data Science Network's Meet the Data Scientist Presentation

If you have any questions about my article or would like to discuss this further, I invite you to Meet the Data Scientist, an event where authors meet the readers, present their topic and discuss their findings.

Thursday, December 15
2:00 to 3:00 p.m. ET
MS Teams – link will be provided to the registrants by email

Register for the Data Science Network's Meet the Data Scientist Presentation. We hope to see you there!

References

North American Product Classification System (NAPCS) Canada 2017 Version 1.0

Cheon, J. H., Kim, A., Kim, M., & Song, Y. (2016). Homomorphic Encryption for Arithmetic of Approximate Numbers. Cryptology ePrint Archive.

Gentry, C. (2009). A fully homomorphic encryption scheme. PhD thesis, Stanford University.

Zanussi, Z., Santos, B., & Molladavoudi, S. (2021). Supervised Text Classification with Leveled Homomorphic Encryption. In Proceedings of the 63rd ISI World Statistics Congress (Vol. 11, p. 16). International Statistical Institute.


Quarterly Survey of Financial Statements: Weighted Asset Response Rate - third quarter 2022

Weighted Asset Response Rate
Table summary
This table displays the results of Weighted Asset Response Rate. The information is grouped by Release date (appearing as row headers), 2021, Q3 and Q4, and 2022, Q1, Q2 and Q3, calculated using percentage units of measure (appearing as column headers).
Release date | 2021 Q3 | 2021 Q4 | 2022 Q1 | 2022 Q2 | 2022 Q3
(quarterly, percentage)
November 23, 2022 | 79.0 | 80.9 | 76.2 | 76.1 | 56.2
August 25, 2022 | 79.0 | 80.9 | 75.0 | 55.7 | ..
May 25, 2022 | 79.0 | 77.3 | 56.7 | .. | ..
February 23, 2022 | 75.6 | 54.2 | .. | .. | ..
November 23, 2021 | 56.7 | .. | .. | .. | ..
.. not available for a specific reference period
Source: Quarterly Survey of Financial Statements (2501)

Amendment to the Employee Wellness Surveys Privacy Impact Assessment (PIA) & Supplement to Statistics Canada's Generic PIA

Statistics Act Employment and Social Development Canada (ESDC) Employee Wellness Survey (EWS)
Privacy Impact Assessment (PIA) Summary

Introduction

This amendment applies to the Employee Wellness Surveys and Pulse Check Surveys PIA (signed by the Chief Statistician on November 5, 2021), and shall also be considered a supplement to Statistics Canada's Generic Privacy Impact Assessment for statistical survey activities as this ESDC EWS will operate under the authority of the Statistics Act on a cost-recovery basis for the client, ESDC, to be administered on employees of ESDC by Statistics Canada.

Objective

An Amendment to the Employee Wellness Surveys and Pulse Check Surveys PIA & Supplement to Statistics Canada's Generic Privacy Impact Assessment – Statistics Act Employment and Social Development Canada (ESDC) Employee Wellness Survey (EWS) was conducted to determine if there were any privacy, confidentiality or security issues with this activity and, if so, to make recommendations for their resolution or mitigation.

Description

The original EWS survey was collected under the authority of the Financial Administration Act (FAA) from Statistics Canada and Statistical Survey Operations employees and was examined in the Employee Wellness Surveys - PIA, whereas this new collection will be conducted under the authority of the Statistics Act on a cost recovery basis for ESDC on their employees. As such, while Statistics Canada's Generic Privacy Impact Assessment (PIA) addresses most of the privacy and security risks related to statistical activities conducted by Statistics Canada, this amendment and supplement is required to describe how the internal HR personal information activity framework that operates under the authority of the FAA (the original EWS) is being modified to collect personal information externally under the authority of the Statistics Act.

  • This ESDC EWS will be administered one time, with the potential for future cycles.
  • One key change is that, unlike in the original EWS analysis, linking activities involving the following PIBs will not be performed for the ESDC EWS:
  • Another change is that for this survey, the sample file will be provided by ESDC, and it will be matched, following collection, to the survey frame that will be built by Statistics Canada from the Incumbent file. The sample file will contain basic personal information for each of their employees (first and last name, email address, first official language and Personal Record Identifier [PRI]). The Incumbent file comes from Treasury Board Secretariat (TBS), and is an extract from the Public Services and Procurement Canada (PSPC) pay system. The Incumbent file is the most comprehensive administrative file available to federal Government of Canada institutions, by nature of its relation to their pay and staffing. Although it contains a great deal of information on employees, their positions, status and pay, only a small number of variables are required and retained from this file for inclusion on the survey frame – which will only be used internally at Statistics Canada for statistical processing purposes (see Section 4 for more detail on the variables taken from the Incumbent file for employees of ESDC).
  • New content has been added to the questionnaire:
    • Questions about organizational unit at a level of granularity which describes where within the ESDC portfolio an employee works, down to branch or region (level 4), in order to ensure that the diverse yet distinct work environments found across portfolios and regions are represented and identifiable in the data.
    • Questions under the TBS Personal Information Bank for Employment Equity and Diversity (PSE 918) which include Indigenous Identity, Gender, and Sociodemographic Characteristics.
      • These questions will provide important context for understanding the unique challenges experienced by unique populations, in support of the Call to Action on Equity, Diversity, and Inclusion "Nothing about us, without us".
    • A question which asks "Would you say you are: Heterosexual, Lesbian or gay, Bisexual, Or please specify" which provides important information about the unique experiences that different respondents may have based on how they identify.
    • A question which asks "On a scale from 1 to 10, where 1 is "not at all important" and 10 is "critically important", how important is addressing psychological health and safety within ESDC?" in order to determine how much weight employees give particular services or programs.
    • A question which asks "How far along do you think ESDC is in terms of creating and sustaining a psychologically healthy and safe work environment? Use a scale from 1 to 10, where 1 is "Just getting started" and 10 is "Sustaining well established policies/programs/supports"" in order to gauge employee perception of how mature ESDC is with their Mental Health strategy implementation.
    • A question which asks "Below is a list of workplace-based services and supports available to help employees cope with challenging situations and issues related to mental health. Please indicate all the services/supports of which you are aware" in order to understand which programs employees are aware of.

Risk area identification and categorization

The risk area identification has changed from the original Employee Wellness Surveys and Pulse Check Surveys (EWSPCS) PIA in the following sub-sections; privacy risk has decreased.

Risk area identification
b) Type of personal information involved and context: Only personal information, with no contextual sensitivities, collected directly from the individual or provided with the consent of the individual for disclosure under an authorized program. Risk scale: 1 (this was "2" for EWSPCS and is "1" for this ESDC Statistics Act collection).
g) Technology and privacy: No (the specific technology category was "yes" for EWSPCS and is "no" for this ESDC Statistics Act collection).

Conclusion

This assessment of the Amendment to the Employee Wellness Surveys and Pulse Check Surveys PIA & Supplement to Statistics Canada's Generic Privacy Impact Assessment – Statistics Act Employment and Social Development Canada (ESDC) Employee Wellness Survey (EWS) did not identify any privacy risks that cannot be managed using existing safeguards.

Commuting – 2021 Census promotional material

Help spread the word about 2021 Census data on commuting in Canada. These data were released on November 30, 2022.

Quick facts

  • The way Canadians commute was altered in 2021 by the pandemic, with lockdowns to slow the spread of COVID-19 and changes in how and where Canadians worked leading to 2.8 million fewer commuters, compared with five years earlier.
  • The number of Canadians "car commuting" (that is, travelling to work by car, truck or van as a driver or as a passenger) declined by 1.7 million from five years earlier to reach 11 million in May 2021. The drop in car commuting mainly occurred among those working in professional service industries, while the number of front-line workers commuting by car increased.
  • There were 245,000 fewer Canadians making car commutes of at least 60 minutes, compared with May 2016.
  • The number of people usually taking public transit to work fell from 2 million in 2016 to 1 million in May 2021, declining for the first time since the census began collecting commuting data in 1996.
  • With the economy more open and most public health measures related to the pandemic removed, the number of car commuters, at 12.8 million, had exceeded 2016 levels by May 2022. However, the number of public transit commuters, at 1.2 million, remained well below pre-COVID-19 levels.
  • Despite the drop in public transit use, the proportion of Canadians using mass transit or walking or cycling to get to work was higher than that of Americans.
  • While Canadian government investments in walking and bicycle trails continue, nearly 300,000 fewer workers were usually using active transit (walking or bicycling) as a main mode of commuting in May 2021, compared with five years earlier. By May 2022, active transit commuting in the provinces had increased to 941,000 from 788,000 in May 2021, but was still lower than the 1.1 million recorded in 2016.

Resources

Social media content

Statistics Canada encourages our community supporters to share our content and images to their own social media accounts. You can save the images to your device and copy and paste the text content to your social media platforms.

Post 1

New #2021Census data offer important insights on what getting to work in May 2021 meant for diverse groups of Canadians.

Find the newly released data here:

bit.ly/3EI8HZf

Download image for post 1

Post 2

The number of people usually taking public transit to work fell from 2016 to 2021, declining for the first time since the census began collecting commuting data in 1996.

Learn more:

bit.ly/3EI8HZf

Download image for post 2

Post 3

With the economy more open and most public health measures removed, the number of car commuters had exceeded 2016 levels by May 2022.

Learn more:

bit.ly/3EI8HZf

Download image for post 3

Post 4

From May 2021 to May 2022, active transit commuting had increased as a main mode of commuting, but was still lower than the 1.1 million recorded in 2016.

Learn more:

bit.ly/3EI8HZf

Download image for post 4

Web images

Commuting tile (JPG, 103 KB)

Terms of use

See the terms of use for information on the approved use of official wordmarks, identifiers and content.
