Retail Commodity Survey: CVs for Total Sales June 2021

This table displays the results of Retail Commodity Survey: CVs for Total Sales (June 2021). The information is grouped by NAPCS-CANADA (appearing as row headers), and Month (appearing as column headers).
| NAPCS-CANADA (rows) / Month (columns) | 202103 | 202104 | 202105 | 202106 |
| --- | --- | --- | --- | --- |
| Total commodities, retail trade commissions and miscellaneous services | 0.66 | 0.63 | 0.76 | 0.63 |
| Retail Services (except commissions) [561] | 0.66 | 0.63 | 0.75 | 0.62 |
| Food at retail [56111] | 0.61 | 0.65 | 0.64 | 0.58 |
| Soft drinks and alcoholic beverages, at retail [56112] | 0.56 | 0.56 | 0.63 | 0.63 |
| Cannabis products, at retail [56113] | 0.00 | 0.00 | 0.00 | 0.00 |
| Clothing at retail [56121] | 1.30 | 1.75 | 1.77 | 1.50 |
| Footwear at retail [56122] | 2.01 | 1.81 | 2.22 | 1.82 |
| Jewellery and watches, luggage and briefcases, at retail [56123] | 5.10 | 6.63 | 8.17 | 4.69 |
| Home furniture, furnishings, housewares, appliances and electronics, at retail [56131] | 0.83 | 0.81 | 0.62 | 0.68 |
| Sporting and leisure products (except publications, audio and video recordings, and game software), at retail [56141] | 2.30 | 3.06 | 3.51 | 2.50 |
| Publications at retail [56142] | 8.72 | 7.33 | 6.41 | 7.80 |
| Audio and video recordings, and game software, at retail [56143] | 5.43 | 4.17 | 4.57 | 4.73 |
| Motor vehicles at retail [56151] | 2.18 | 1.96 | 2.68 | 2.25 |
| Recreational vehicles at retail [56152] | 5.44 | 4.42 | 5.75 | 2.61 |
| Motor vehicle parts, accessories and supplies, at retail [56153] | 1.86 | 1.92 | 2.03 | 1.79 |
| Automotive and household fuels, at retail [56161] | 2.19 | 2.45 | 1.84 | 1.60 |
| Home health products at retail [56171] | 2.73 | 2.33 | 2.73 | 2.79 |
| Infant care, personal and beauty products, at retail [56172] | 2.37 | 2.18 | 1.98 | 1.92 |
| Hardware, tools, renovation and lawn and garden products, at retail [56181] | 1.66 | 1.87 | 1.94 | 1.90 |
| Miscellaneous products at retail [56191] | 3.21 | 2.94 | 3.18 | 3.21 |
| Total retail trade commissions and miscellaneous services (footnote 1) | 1.83 | 1.74 | 2.08 | 2.04 |

Footnotes

1. Comprises the following North American Product Classification System (NAPCS): 51411, 51412, 53112, 56211, 57111, 58111, 58121, 58122, 58131, 58141, 72332, 833111, 841, 85131 and 851511.

Canadian Association of University Business Officers (CAUBO)

Financial Information of Universities – 2020/2021

General information

  • Name of University (or College)
  • Address of preparer
    • Street
    • City
    • Province
    • Postal Code
  • Fiscal year ending: Day Month Year
  • Name and title of preparer
  • Telephone
    • Area code
    • Number
    • Local
  • Fax
    • Area code
    • Number
  • E-mail address
  • Name of Senior Administrative Officer (if different from above)

Instructions

  1. Please read carefully the accompanying Guidelines.
  2. All amounts should be expressed in thousands of dollars ($'000).
  3. In the "Observations and Comments" section, please explain financial data that may not be comparable with the prior year.
  4. Please do not fill in shaded areas. All non-shaded cells should be completed.
    A nil entry should be indicated with a zero.
  5. Please complete and return the Transmittal Letter.

Reserved for Statistics Canada

  • Full-time equivalent
  • Report Status
  • Institution Code: cbeYYIII
  • Comments
Table 1
Income by fund
Table summary
This is an empty data table used by respondents to provide data to Statistics Canada. This table contains no data.
Types of income Funds
General operating Special purpose and trust Sponsored research Ancillary Capital Endowment Total funds
Entities consolidated Entities not consolidated Sub-total
(thousands of dollars)
Government departments and agencies - grants and contracts  
Federal  
1. Social Sciences and Humanities Research Council                  
2. Health Canada                  
3. Natural Sciences and Engineering Research Council                  
4. Canadian Institutes of Health Research (CIHR)                  
5. Canada Foundation for Innovation (CFI)                  
6. Canada Research Chairs                  
7. Other federal (see Table 6)                  
Other  
8. Provincial (see Table 7)                  
9. Municipal                  
10. Other provinces                  
11. Foreign                  
Tuition and other fees  
12. Credit course tuition                  
13. Non-credit tuition                  
14. Other fees                  
Donations, including bequests  
15. Individuals                  
16. Business enterprises                  
17. Not-for-profit organizations                  
Non-government grants and contracts  
18. Individuals                  
19. Business enterprises                  
20. Not-for-profit organizations                  
Investment  
21. Endowment                  
22. Other investment                  
Other  
23. Sale of services and products                  
24. Miscellaneous                  
25. Total (Note 1)

  Observations and comments

  • Description (Fund and type of income)
  • Comments
Table 2
Expenditures by fund
Table summary
This is an empty data table used by respondents to provide data to Statistics Canada. This table contains no data.
Types of expenditures Funds
General operating Special purpose and trust Sponsored research Ancillary Capital Endowment Total funds
Entities consolidated Entities not consolidated Sub-total
(thousands of dollars)
Academic salaries  
1. Academic ranks                  
2. Other instruction and research                  
3. Other salaries and wages                  
4. Benefits                  
5. Travel                  
6. Library acquisitions                  
7. Printing and duplicating                  
8. Materials and supplies                  
9. Communications                  
10. Other operational expenditures                  
11. Utilities                  
12. Renovations and alterations                  
13. Scholarships, bursaries and prizes                  
14. Externally contracted services                  
15. Professional fees                  
16. Cost of goods sold                  
17. Interest                  
18. Furniture and equipment purchase                  
19. Equipment rental and maintenance                  
20. Internal sales and cost recoveries (Note 1)
21. Sub-total                  
22. Buildings, land and land improvements                  
23. Lump sum payments                  
24. Total (Note 2)

Observations and comments

  • Description (Fund and type of expenditure)
  • Comments
Table 3
Statement of changes in net assets by fund
Table summary
This is an empty data table used by respondents to provide data to Statistics Canada. This table contains no data.
Objects Funds
General operating Special purpose and trust Sponsored research Ancillary Capital Endowment Total funds
Entities consolidated Entities not consolidated Sub-total
(thousands of dollars)
1. Net asset balances, beginning of year                  
2. Income (Table 1, line Total)                  
3. Expenditures (Table 2, line Total)                  
4. Prior year adjustments                  
5. Interfund transfers (Note 1)
6. Add: borrowings                  
7. Deduct: principal portion of debt repayments                  
8. Interfund reallocations (Note 1)
9. Add: capital expenditures                  
10. Deduct: amortization                  
11. Add or deduct: deferred income                  
12. Add or deduct: pension costs and vacation pay accrual                  
13. Add or deduct: future cost of employee benefits                  
14. Add or deduct: related or affiliated entities
15. Add or deduct: other (provide details in space below)                  
16. Net asset balances, end of year (Note 2)
Net asset balances are comprised of:                  
17. Unrestricted net assets                  
18. Investment in capital assets                  
19. Internally restricted net assets                  
20. Externally restricted net assets                  
21. Net asset balances, end of year (Note 2)

Observations and comments

  • Description (Fund and object)
  • Comments
Table 4
General operating expenditures by function
Table summary
This is an empty data table used by respondents to provide data to Statistics Canada. This table contains no data.
Types of expenditures Functions
Instruction and non-sponsored research Non-credit instruction Library Computing and communications Administration and academic support Student services Physical plant External Relations Total functions (Note 1)
(thousands of dollars)
Academic salaries  
1. Academic ranks                  
2. Other instruction and research                  
3. Other salaries and wages                  
4. Benefits                  
5. Travel                  
6. Library acquisitions                  
7. Printing and duplicating                  
8. Materials and supplies                  
9. Communications                  
10. Other operational expenditures                  
11. Utilities                  
12. Renovations and alterations                  
13. Scholarships, bursaries and prizes                  
14. Externally contracted services                  
15. Professional fees                  
16. Cost of goods sold                  
17. Interest                  
18. Furniture and equipment purchase                  
19. Equipment rental and maintenance                  
20. Internal sales and cost recoveries                  
21. Sub-total                  
22. Buildings, land and land improvements                  
23. Lump sum payments                  
24. Total                  

Observations and comments

  • Description (Function and type of expenditure)
  • Comments
Table 5
Affiliation report
Table summary
This is an empty data table used by respondents to provide data to Statistics Canada. This table contains no data.
Code Legal Name of Affiliated Institution Category of Affiliation
Health Research Institute Other Research Institute Affiliated Hospital Other Affiliated Institution Associated Hospital Other Associated Institution Federated Institution Basis of Reporting Amount Included in Annual Return ($'000)
Included Excluded
For columns 1 to 9, indicate with an "x" in the appropriate column.  
Part I: Separate legal entities consolidated  
1                    
2                    
3                    
4                    
5                    
6                    
7                    
8                    
9                    
10                    
For columns 1 to 7, indicate with an "x" in the appropriate column.  
Part II: Separate legal entities not consolidated  
List each separate legal entity over $100,000  
11                    
12                    
13                    
14                    
15                    
16                    
17                    
18                    
19. Total of all other legal entities under $100,000                    
20. Total (Note 1)

Observations and comments

  • Description (Function and type of expenditure)
  • Comments
Table 6
Other federal government departments and agencies – Grants and contracts
Table summary
This is an empty data table used by respondents to provide data to Statistics Canada. This table contains no data.
Source of grant/contract Funds
General operating Special purpose and trust Sponsored research Ancillary Capital Endowment Total funds
Entities consolidated Entities not consolidated Sub-total
(thousands of dollars)
1. A. Indirect costs of research                  
B. Separately list each department and agency over $100,000:  
2                  
3                  
4                  
5                  
6                  
7                  
8                  
9                  
10                  
11                  
12                  
13                  
14                  
15                  
16                  
17                  
18                  
19                  
20                  
21                  
22                  
23                  
24                  
25. C. Total of all departments and agencies under $100,000                  
26. Total (Note 1)

Observations and comments

  • Description
  • Comments
Table 7
Provincial government departments and agencies – Grants and contracts
Table summary
This is an empty data table used by respondents to provide data to Statistics Canada. This table contains no data.
Source of grant/contract Funds
General operating Special purpose and trust Sponsored research Ancillary Capital Endowment Total funds
Entities consolidated Entities not consolidated Sub-total
(thousands of dollars)
A. Ministry responsible (total grants and contracts):  
1                  
2. CFI matching funds                  
B. Other (list each department and agency over $100,000):  
3                  
4                  
5                  
6                  
7                  
8                  
9                  
10                  
11                  
12                  
13                  
14                  
15                  
16                  
17                  
18                  
19                  
20                  
21                  
22                  
23                  
24. C. Total of all departments and agencies under $100,000                  
25. Total (Note 1)

Observations and comments

  • Description
  • Comments

Canadian Quarterly Labour Productivity Accounts — Technical Notes: a revised version - 2021

By Mustapha Kaci and Marc Tanguay

Quarterly estimates of labour productivity growth and related variables were published for the first time on December 20, 2000 for the aggregate business sector and on December 12, 2003 for its major industrial sectors (footnote 1).

The seasonally adjusted statistical series at the aggregate level (total economy, business sector and non-business sector) begin in the first quarter of 1981, while those at the industry level are available only back to the first quarter of 1997. These quarterly estimates are meant to help those who analyze the short-term relationship between real output, employment, hours worked and compensation.

Quarterly estimates of labour productivity and related series are published in index form (using a base year consistent with National Accounts) at the aggregate and industry levels.

Hours worked for all jobs

Hours worked represents the total number of hours that a person devotes to work, whether paid or unpaid. Generally, this includes regular and overtime hours, coffee breaks, on-the-job training, as well as time lost due to momentary interruptions in production when the persons involved remain on the job. However, time lost due to strikes or lockouts, to statutory holidays, vacations, as well as illness, maternity or other personal leave are all excluded from the total number of hours worked.

Quarterly estimates of labour input make the distinction between two main categories of jobs:

  • Paid worker jobs, which comprise employee jobs as well as jobs held by owners of an incorporated enterprise.
  • Jobs occupied by self-employed workers, which comprise employers of an unincorporated business, unincorporated own-account jobs and unpaid family-related jobs.

The number of hours worked is calculated as the number of jobs multiplied by the average hours worked, as collected by the Labour Force Survey (LFS).

The number of jobs in the business sector is obtained residually by subtracting all jobs occupied in non-commercial activities from the number of jobs in the total economy. An estimate of the number of jobs for the overall economy is first produced from LFS estimates for all ten Canadian provinces (footnote 2), to which are added secondary jobs of workers with more than one job. Employees who hold a job but were not at work during the LFS reference week, and have no right to compensation during their absence, are removed from the estimates. Finally, all workers in self-employed jobs who were not at work during the reference week are also excluded.

In the System of National Accounts (SNA), non-commercial activities comprise two main components: the government sector and non-profit institutions servicing households. Job estimates for the government sector come mainly from the Survey of Employment, Payroll and Hours (SEPH). Estimates for non-profit institutions servicing households mainly encompass social and community services including religious groups, philanthropic foundations, civic, professional and other similar organizations. Employment for non-profit institutions is built from a linkage between the edited PD7 (footnote 3) files and SNA's T4 allocation system developed from Business Register information.

Once the number of jobs for the business sector has been derived, the number of hours worked is calculated by multiplying each component of jobs by their respective average hours worked.
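The residual derivation and the jobs-times-hours calculation described above can be sketched as follows. This is a minimal illustration only: the figures are hypothetical, the function names are invented for this example, and taking the residual separately for each job category is an assumption made to keep the sketch short.

```python
# Hypothetical quarterly figures (thousands of jobs, average hours per job).
# These are illustrative values, not published estimates.
total_economy_jobs = {"paid": 15_000.0, "self_employed": 2_500.0}
non_commercial_jobs = {"paid": 3_600.0, "self_employed": 0.0}
avg_hours_per_job = {"paid": 420.0, "self_employed": 450.0}


def business_sector_jobs(total, non_commercial):
    """Residual: total-economy jobs minus non-commercial jobs, per category."""
    return {cat: total[cat] - non_commercial[cat] for cat in total}


def hours_worked(jobs, avg_hours):
    """Hours worked: sum over job categories of jobs times average hours."""
    return sum(jobs[cat] * avg_hours[cat] for cat in jobs)


business_jobs = business_sector_jobs(total_economy_jobs, non_commercial_jobs)
business_hours = hours_worked(business_jobs, avg_hours_per_job)
print(business_jobs, business_hours)
```

Keeping the job categories separate until the final sum mirrors the text: each component of jobs is multiplied by its own average hours before aggregation.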

At the industry level, all data on average hours worked by industry and by category of worker are taken from the LFS. However, the industrial breakdown for employee jobs is mainly from SEPH. Only data from the LFS are used to estimate the employee jobs in agriculture, agricultural services, and fishing and hunting. In the case of the categories of jobs occupied by self-employed workers, the industrial detail is obtained by integrating information from the five-year censuses and the LFS.

Jobs and hours worked estimates by industry are then adjusted to their respective business sector total of jobs and hours worked, obtained residually from the total economy and the non-commercial sector.

Finally, to ensure consistency with the annual data from the labour productivity database, the quarterly indices of labour input are adjusted to their respective annual benchmarks when they become available. A new yearly benchmark becomes available at the business sector level upon the release of the first quarter indices for the business sector, and upon the release of the third quarter indices at the industry level.

Real gross domestic product (GDP) as the measure of output

Quarterly estimates of real value added (or real GDP) used to calculate productivity in the business sector and its component two-digit industries are built up using a chained Fisher volume index method.
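For readers unfamiliar with the method, a chained Fisher volume index can be sketched as below. This is the textbook formula (the geometric mean of the Laspeyres and Paasche volume indexes, linked period to period), not the production implementation used in the accounts. The two-product prices and quantities are hypothetical.

```python
from math import sqrt


def laspeyres_volume(p0, q0, q1):
    """Volume change valued at previous-period prices."""
    return sum(p * q for p, q in zip(p0, q1)) / sum(p * q for p, q in zip(p0, q0))


def paasche_volume(p1, q0, q1):
    """Volume change valued at current-period prices."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p1, q0))


def fisher_volume(p0, p1, q0, q1):
    """Fisher volume index: geometric mean of Laspeyres and Paasche."""
    return sqrt(laspeyres_volume(p0, q0, q1) * paasche_volume(p1, q0, q1))


def chained_fisher(prices, quantities, base=100.0):
    """Chain period-to-period Fisher links into an index starting at `base`."""
    index = [base]
    for t in range(1, len(prices)):
        link = fisher_volume(prices[t - 1], prices[t],
                             quantities[t - 1], quantities[t])
        index.append(index[-1] * link)
    return index


# Hypothetical data: two products observed over three periods.
prices = [[1.0, 2.0], [1.1, 2.0], [1.2, 2.1]]
quantities = [[10.0, 5.0], [11.0, 5.0], [11.0, 6.0]]
print(chained_fisher(prices, quantities))
```

Chaining updates the price weights every period, which is why the result can differ from a fixed-weight Laspeyres index computed over the same span.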

For the business sector, quarterly estimates of output are derived from chained Fisher volume indexes of GDP at market prices (expenditure-based), sourced from the Quarterly Income and Expenditure Accounts. These quarterly estimates of real GDP in the business sector are constructed after removing the value added of the government sector, non-profit institutions, and the rental value of owner-occupied dwellings. Value added related to paid employees of private households is also removed. This approach is similar to that used for the quarterly measures of productivity in the United States.

Corresponding exclusions are also made for labour compensation and hours worked, in order to make output and the labour statistics consistent with one another. In 2019, nominal GDP in the business sector accounted for roughly 73.5% of the Canadian economy.

Since October 1st, 2012, the output series reflect the capitalization of research and development activities and military weapons systems introduced by the Canadian System of National Economic Accounts. This change brought Canada in line with the United States, thereby improving the comparability of the quarterly measures of productivity with those published by the U.S. Bureau of Labor Statistics.

At the industry level, quarterly estimates of output are obtained from the estimates of value added at basic prices, published by the Industry Accounts Division. The chained Fisher volume index is used in years for which final supply and use tables are available. For the most current years without these annual benchmarks, real value added is based on a fixed-weight Laspeyres volume index. It should be noted that quarterly estimates of the value added used to calculate the productivity in the service-producing businesses as well as its component — the real estate, rental and leasing sector — exclude the rental value of owner occupied dwellings as there are no data on the number of hours that homeowners spend on dwelling maintenance services. Private households are also excluded from other business services — the industry grouping to which they would normally be associated.

All quarterly estimates by industry are available for two-digit NAICS industries, the goods-producing business sector, and the service-producing business sector.

It should be noted again that the GDP in the business sector is at market prices but the GDP by industry series is at basic prices (footnote 4). As the valuation of output in the business sector differs from that used at the industry level, these measures are not directly comparable.

Labour productivity: a measure of real GDP per hour worked

Quarterly estimates of productivity for the total economy, business sector and by industry are based on a Fisher-chained volume index of GDP.

The labour productivity measures relate real output (real GDP) to labour input (hours worked). They estimate the change in the output per hour worked from one period to another. In other words, the growth of labour productivity is meant to estimate the efficiency with which the hours worked in all jobs in a sector are used in production. Economic performance, as measured by labour productivity, must be interpreted carefully, since these estimates reflect changes in other inputs, in particular capital, in addition to the efficiency growth of production processes.

Because different index numbers and different valuations of output are used (market prices for the business sector aggregate and basic prices for the major industrial sectors), the aggregation framework of the productivity accounts for the business sector as a whole is not entirely consistent with the accounts detailed by industrial sector.
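As a simple numerical illustration of output per hour worked and its growth, consider the sketch below. The GDP and hours figures are hypothetical and the function names are chosen for this example only.

```python
def labour_productivity(real_gdp, hours_worked):
    """Real output per hour worked."""
    return real_gdp / hours_worked


def percent_change(current, previous):
    """Period-to-period growth rate, in percent."""
    return (current / previous - 1.0) * 100.0


# Hypothetical values for two consecutive quarters.
real_gdp = [500_000.0, 510_000.0]   # real GDP, millions of chained dollars
hours = [5_000.0, 5_050.0]          # hours worked, millions

productivity = [labour_productivity(g, h) for g, h in zip(real_gdp, hours)]
productivity_growth = percent_change(productivity[1], productivity[0])
output_growth = percent_change(real_gdp[1], real_gdp[0])
hours_growth = percent_change(hours[1], hours[0])

# For small changes, productivity growth is close to
# output growth minus hours growth.
print(productivity_growth, output_growth - hours_growth)
```

The approximation in the last line holds only for small growth rates; the exact relationship is multiplicative.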

Total labour compensation and unit labour cost

Labour compensation measures the value of labour services entering in the production process. This compensation consists of all payments in cash or in kind made by domestic producers to workers for services rendered – in other words, total payroll. It includes the compensation of employees consisting of wages and salaries (including bonuses, gratuities, taxable allowances and retroactive wage payments) and supplementary labour income of paid workers (various contributions to employees), plus an imputed labour income for self-employed workers.

As was the case for estimating jobs, the labour compensation estimates in the business sector are obtained residually by subtracting the wages, salaries and supplementary labour income for the non-business sector from labour compensation for the total economy.

The data on income for all paid jobs in the total economy and at the industry level are taken directly from the estimates of compensation of employees in the quarterly income and expenditure accounts. Compensation of employees for self-employed workers is established by imputation. This imputation is based on relative distance modelling (as observed in 5-year censuses) between compensation rates for self-employed workers and paid employees, and varies from one industry to another.

No compensation of employees is imputed to unpaid family workers since by definition, they get no compensation for their work.

In all sectors, labour compensation comprises not only wages and salaries, but also employers' contributions to indirect benefits (such as pension and insurance plans). These initial estimates are also obtained from the quarterly income and expenditure accounts, but for productivity measures, an additional industry distribution is carried out.

Compensation per hour worked (or hourly compensation) is the ratio of the total compensation for all jobs to the number of hours worked.

Unit labour cost is the labour cost per unit of output. It is calculated as labour compensation divided by real value added. It is also equal to the ratio of labour compensation per hour worked (hourly compensation) and labour productivity. In other words, it is the joint result of changes in hourly compensation and productivity: unit labour cost increases when labour compensation per hour worked increases more rapidly than labour productivity. It is widely used to measure inflation pressures arising from wage growth.
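The two equivalent definitions of unit labour cost given above can be checked with a short sketch. All figures are hypothetical.

```python
# Hypothetical quarterly values for one sector.
compensation = 300_000.0       # total labour compensation, millions of dollars
real_value_added = 500_000.0   # real GDP, millions of chained dollars
hours = 5_000.0                # hours worked, millions

# Definition 1: labour compensation divided by real value added.
ulc_direct = compensation / real_value_added

# Definition 2: hourly compensation divided by labour productivity.
hourly_compensation = compensation / hours
productivity = real_value_added / hours
ulc_ratio = hourly_compensation / productivity

print(ulc_direct, ulc_ratio)
```

Hours worked cancel out of the second definition, which is why the two results agree. It also shows why unit labour cost rises whenever hourly compensation grows faster than productivity.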

The unit labour cost in U.S. dollars is equivalent to the ratio of the Canadian unit labour cost to the exchange rate. The latter corresponds to the U.S. dollar value, expressed in Canadian dollars. The exchange rate used is the monthly average exchange rate in Canadian dollars, published by the Bank of Canada.

Relative unit cost is an often-used concept for determining Canadian businesses' competitiveness compared to a foreign competitor. The relative unit cost is defined as the difference between the rate of growth of Canada's unit labour cost and that of a foreign country, with these costs expressed in a common currency for purposes of comparison.
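The conversion to U.S. dollars and the relative unit cost can be sketched as follows. The exchange rates, cost levels and the foreign growth rate are hypothetical. The exchange rate is expressed as Canadian dollars per U.S. dollar, as described above.

```python
def ulc_in_usd(ulc_cad, cad_per_usd):
    """Canadian unit labour cost divided by the exchange rate (CAD per USD)."""
    return ulc_cad / cad_per_usd


def growth_rate(current, previous):
    """Growth rate in percent."""
    return (current / previous - 1.0) * 100.0


def relative_unit_cost(canada_growth, foreign_growth):
    """Difference between the two growth rates, both in a common currency."""
    return canada_growth - foreign_growth


# Hypothetical values for two periods.
ulc_cad = [0.60, 0.63]        # Canadian unit labour cost
cad_per_usd = [1.25, 1.20]    # value of one U.S. dollar in Canadian dollars

ulc_usd = [ulc_in_usd(u, r) for u, r in zip(ulc_cad, cad_per_usd)]
canada_growth_usd = growth_rate(ulc_usd[1], ulc_usd[0])
foreign_growth_usd = 2.0      # assumed growth of the foreign unit labour cost, %

print(relative_unit_cost(canada_growth_usd, foreign_growth_usd))
```

In this example the Canadian dollar appreciates, so the unit labour cost grows faster in U.S. dollars than in Canadian dollars.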

Statistical adjustments

Seasonal Adjustment

Economic time series observed monthly or quarterly often show seasonal patterns that repeat every year during the same month or quarter. Seasonal patterns are changes that occur regularly during a given period of time. They relate to the seasons, sociological patterns and the pace of human activity.

All basic variables needed for productivity analyses (such as hours worked, jobs, output and compensation) are seasonally adjusted using Statistics Canada's X-12-ARIMA program. Seasonal adjustment consists of removing the combined seasonal and calendar effects from the series, which helps to highlight the fluctuations that are most relevant from an economic point of view. A series affected by seasonal fluctuations is of little use for economic interpretation, since these fluctuations substantially mask cyclical trends.

For information on seasonal adjustment, see Seasonally adjusted data – Frequently asked questions.

Seasonal adjustment is generally made by two main categories of workers (paid workers and unincorporated self-employed workers) at the industry level, and the seasonally-adjusted aggregates of jobs and hours worked are obtained by summation. In the hours worked series for the total economy, the class of paid workers is split between employees and incorporated self-employed, which facilitates the reconciliation with the data published by LFS.

Regression models to adjust for reference week effects and holiday effects on hours worked

The definition of the LFS reference week (usually the week with the 15th day of the month) implies that the actual dates of the week vary from year to year. This variability may impact the month-to-month change in hours worked estimates. In addition, hours worked are affected by variability in the dates of the reference week, combined with the presence of fixed (Thanksgiving, Remembrance Day) or moving (Easter Friday and Easter Monday) holidays. Specifically, in some years, holidays may occur during the reference week, reducing work hours during that week. This variability could introduce significant fluctuations in estimates of hours worked, and it is therefore removed from the series prior to seasonal adjustment.

To remove reference week effects and holiday effects, the hours worked series are subject to prior adjustments. These corrections remove the effects attributable to situations where the 15th of the month falls relatively early or late in the reference week and situations where some holidays fall outside the reference week.

These effects are estimated by the seasonal adjustment method X-12-ARIMA using appropriate regression specifications with ARIMA residuals.
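The idea of a regression-based prior adjustment can be illustrated with a deliberately simplified sketch. It is not the X-12-ARIMA regression with ARIMA residuals described above. It is an ordinary least squares fit with an intercept and a single holiday dummy, for which the estimated coefficient equals the difference between the two group means. The hours series and holiday flags are hypothetical.

```python
from statistics import mean


def estimate_holiday_effect(hours, holiday_in_week):
    """OLS coefficient on a 0/1 holiday dummy (difference of group means)."""
    with_holiday = [h for h, d in zip(hours, holiday_in_week) if d]
    without_holiday = [h for h, d in zip(hours, holiday_in_week) if not d]
    return mean(with_holiday) - mean(without_holiday)


def remove_holiday_effect(hours, holiday_in_week):
    """Subtract the estimated effect from the observations flagged as affected."""
    effect = estimate_holiday_effect(hours, holiday_in_week)
    return [h - effect if d else h for h, d in zip(hours, holiday_in_week)]


# Hypothetical hours worked for the same month in five successive years,
# with a flag for years in which a holiday fell in the reference week.
october_hours = [100.0, 95.0, 101.0, 94.0, 99.0]
holiday_flag = [0, 1, 0, 1, 0]

effect = estimate_holiday_effect(october_hours, holiday_flag)
adjusted = remove_holiday_effect(october_hours, holiday_flag)
print(effect, adjusted)
```

A production model would estimate such effects jointly with the ARIMA structure of the series, so that ordinary trend and seasonal movement are not mistaken for a holiday effect.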

Benchmark adjustment

As a result of using different data sources and methodologies, the annual values (jobs, hours worked, GDP, compensation) and the yearly totals of the independently produced quarterly estimates are not identical. For instance, some components of labour statistics are processed only on an annual basis, such as employment in the three Canadian territories, employment on Indian reserves and international flows of workers. This difference between the two sets of estimates is eliminated by integrating the annual benchmark values into the quarterly estimates. This integration process, called benchmarking, generates a series which moves as much as possible with the original quarterly series and sums to the annual benchmarks. In other words, this procedure restores coherence between time series data of the same target variable measured at different frequencies (e.g. quarterly and annually). Starting in June 2011, Statistics Canada's in-house SAS Proc Benchmarking program has been used for this purpose. This procedure is available in G-Series production versions v1.04 and v2.0 (footnote 5).
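To make the goal of benchmarking concrete, the sketch below uses the simplest possible approach, pro-rata scaling within one year. This is a stand-in for illustration and is not the method implemented in Proc Benchmarking. The quarterly values and the annual benchmark are hypothetical.

```python
def prorate_to_benchmark(quarterly, annual_benchmark):
    """Scale the quarters of one year so that they sum to the annual benchmark."""
    factor = annual_benchmark / sum(quarterly)
    return [q * factor for q in quarterly]


# Hypothetical quarterly estimates (sum = 400) and an annual benchmark of 410.
quarterly_estimates = [98.0, 101.0, 103.0, 98.0]
annual_benchmark = 410.0

benchmarked = prorate_to_benchmark(quarterly_estimates, annual_benchmark)
print(benchmarked, sum(benchmarked))
```

Pro-rata scaling preserves quarter-to-quarter growth rates within the year, but it can create an artificial step between the fourth quarter of one year and the first quarter of the next when the scaling factors differ. Movement-preserving benchmarking methods are designed to avoid that step.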

Raking procedure used in seasonal adjustment

Seasonally adjusted estimates of overall jobs and hours worked for the business sector are derived by subtracting adjusted estimates for the non-business sector from those of the total economy. The resulting overall estimate is used as a quarterly benchmark for other seasonally adjusted series by industry. For example, hours worked estimates by industry are adjusted independently and then adjusted so that their total sums to the overall quarterly benchmark, while maintaining consistency with the annual detail. This procedure is known as raking. Starting in June 2011, Statistics Canada's in-house SAS Proc TSRaking program has been used for this purpose. This procedure is available in G-Series production versions v1.04 and v2.0.
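Raking can be illustrated with iterative proportional fitting, a classic technique that alternately rescales rows and columns until both sets of totals are met. This sketch is an assumption-laden illustration and is not a description of how Proc TSRaking works internally. The table of hours by industry and quarter and the control totals are hypothetical.

```python
def rake(table, row_totals, col_totals, max_iter=100, tol=1e-10):
    """Iterative proportional fitting of a positive table to two sets of totals.

    Rows here are industries (annual totals) and columns are quarters
    (quarterly business sector benchmarks). The two sets of totals must
    have the same grand total.
    """
    table = [row[:] for row in table]  # work on a copy
    for _ in range(max_iter):
        # Scale each row to its annual total.
        for i, row in enumerate(table):
            factor = row_totals[i] / sum(row)
            table[i] = [x * factor for x in row]
        # Scale each column to its quarterly benchmark.
        for j, target in enumerate(col_totals):
            factor = target / sum(row[j] for row in table)
            for row in table:
                row[j] *= factor
        # Stop once the row totals are also satisfied.
        if max(abs(sum(r) - t) for r, t in zip(table, row_totals)) < tol:
            break
    return table


# Hypothetical hours worked: two industries by four quarters.
industry_hours = [[50.0, 52.0, 54.0, 51.0],
                  [30.0, 31.0, 33.0, 30.0]]
annual_by_industry = [210.0, 126.0]              # annual detail, sum = 336
quarterly_benchmark = [82.0, 84.0, 88.0, 82.0]   # overall totals, sum = 336

raked = rake(industry_hours, annual_by_industry, quarterly_benchmark)
print(raked)
```

After raking, the industries in each quarter sum to the overall quarterly benchmark and each industry's quarters sum to its annual value, which are the two constraints described in the text.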

Revisions to the quarterly series

Statistical revisions are carried out to incorporate the most recent information from quarterly and annual surveys, taxation statistics, public accounts, censuses, etc., as well as from the annual benchmarking process to the supply and use tables.

Quarterly labour productivity estimates and related measures are released four times per year. As shown above, the estimates are produced from various data sources, and they are often revised as a result of the updates to benchmark data, methodologies, and seasonal adjustment.

Data are released within 63-67 days after the reference period. Estimates for each quarter are revised when those for subsequent quarters of the same year are published. At the time of the third quarter of each year, revisions are generally undertaken back three years in conjunction with the National Economic Accounts Quarterly GDP revision process. Benchmarked estimates are not normally revised again except when periodic comprehensive revisions are carried out to incorporate the latest international concepts, classifications, and estimation methods.

Canadian Health Measures Survey - Cycle 6 (2018-2019) Response rates: Activity monitor subsample

Cycle 6 (2018-2019) Response rates
| Age group | Sex | Combined response rate (%), activity monitor subsample |
| --- | --- | --- |
| ages 3 to 5 | Both sexes | 31.9 |
| ages 6 to 11 | Males | 38.1 |
| ages 6 to 11 | Females | 35.0 |
| ages 12 to 19 | Males | 24.5 |
| ages 12 to 19 | Females | 27.0 |
| ages 20 to 39 | Males | 28.2 |
| ages 20 to 39 | Females | 30.6 |
| ages 40 to 59 | Males | 36.3 |
| ages 40 to 59 | Females | 39.1 |
| ages 60 to 79 | Males | 32.8 |
| ages 60 to 79 | Females | 31.9 |

Canadian Health Measures Survey - Cycle 6 (2018-2019) Data accuracy: Activity monitor subsample

Cycle 6 (2018-2019) Data accuracy
Average time spent sedentary (minutes per day)

| Age group | Sex | Average | c.v. (%) |
| --- | --- | --- | --- |
| ages 3 to 5 | Both sexes | 470 | 1.5 |
| ages 6 to 11 | Males | 465 | 2.1 |
| ages 6 to 11 | Females | 466 | 1.4 |
| ages 12 to 17 | Males | 530 | 1.4 |
| ages 12 to 17 | Females | 547 | 1.5 |
| ages 18 to 39 | Males | 571 | 1.4 |
| ages 18 to 39 | Females | 564 | 1.5 |
| ages 40 to 59 | Males | 568 | 1.3 |
| ages 40 to 59 | Females | 571 | 1.0 |
| ages 60 to 79 | Males | 587 | 1.4 |
| ages 60 to 79 | Females | 585 | 1.0 |

Privacy Preserving Technologies Part Two: Introduction to Homomorphic Encryption

By Zachary Zanussi, Statistics Canada

Have you ever wished that there was a way to access data to perform analytics while preserving the privacy of the data itself? Homomorphic encryption is an emerging privacy preserving technique with potential applications that will allow for greater access while keeping data encrypted and secure.

The first article in the series, Brief Survey of Privacy Preserving Technologies, introduced privacy preserving techniques (PPTs) and how they are poised to enable analytics while protecting the privacy of the data. This article will build on that topic by taking a deeper look at one of these techniques, homomorphic encryption (HE), including what it is, how it works and what it can do for you.

This article begins with an overview of HE and introduces some common use cases. It gives an honest evaluation of HE's advantages and disadvantages. Then it will cover some of the more technical details to prepare you to dig into these techniques yourself! By the end of this article, hopefully you will be inspired to continue your learning by picking an HE library and making your own encrypted circuits.

Homomorphic encryption is currently being considered by international groups for standardization. The Government of Canada does not recommend that HE, or any cryptographic technique, be used in practice before standardization by experts. While HE is not yet ready for use on sensitive data, this is a great time to explore its functionality and potential use cases. Expect a future article on the standardization activities related to HE including expected timelines and schemes.

What is homomorphic encryption?

A traditional encryption scheme maps human-readable plaintexts into masked ciphertexts to protect data from prying eyes. Once masked, these ciphertexts are immutable; changing even a single bit in the ciphertext may return an unrecognizable plaintext message upon decryption. This makes traditional encryption quite static. By contrast, a homomorphic encryption scheme is dynamic; given two ciphertexts, you can perform operations on the underlying plaintexts. For example, a homomorphic 'add' operation will return a ciphertext that, upon decryption, returns the sum of the two original plaintext messages. This allows you to delegate computation to another party, who can manipulate the encrypted data without ever accessing the underlying plaintexts.

A typical cloud computing protocol involves a client sending its data to the cloud. Since internet connections are inherently insecure, this transfer is facilitated by a form of transport security protocol that involves encryption, such as HTTPS. Upon receipt, the cloud decrypts and begins computation. However, what if you want to keep the data secret from the cloud? If you encrypted with a homomorphic scheme, not only would the data be protected during transport, but it would also be protected during the entire computation process. Upon completion, the cloud would forward the encrypted results back to the client, who could decrypt and view the results at their leisure.

The term "homomorphic" comes from Greek, roughly translating to "similar form." In mathematics, a homomorphism is a map from one mathematical structure to another that preserves the operations of the first structure. To construct a homomorphic encryption scheme, you need an encryption map that scrambles the data enough that no one can figure out what they are, while simultaneously preserving the structure of the data so that operations on ciphertexts result in predictable results in the plaintexts. These paradoxical goals underscore the difficulty in constructing such a scheme.
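To make "preserving the structure" concrete, here is a minimal sketch using textbook RSA with tiny primes. This is an assumption chosen for brevity: it is not one of the HE schemes discussed in this article, it supports only multiplication, and it is completely insecure. It only shows that multiplying two ciphertexts can correspond to multiplying the plaintexts underneath.

```python
# Toy textbook RSA with tiny primes: insecure, for illustration only.
prime_p, prime_q = 61, 53
n = prime_p * prime_q                                # public modulus
e = 17                                               # public exponent
d = pow(e, -1, (prime_p - 1) * (prime_q - 1))        # private exponent (Python 3.8+)


def encrypt(m):
    return pow(m, e, n)


def decrypt(c):
    return pow(c, d, n)


m1, m2 = 7, 12
# The product is computed on ciphertexts only; no key is needed for this step.
c_product = (encrypt(m1) * encrypt(m2)) % n
result = decrypt(c_product)   # equals (m1 * m2) % n
```

The party computing `c_product` never sees 7, 12 or their product, yet the key holder recovers the product on decryption.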

Figure 1: An illustration of the benefits of HE. On the left is ordinary encryption; to apply the desired analytics, the data need to first be decrypted using the private key. To make the results safe for transport, it must be re-encrypted. In addition, the data are vulnerable for the duration of the computation. On the right is HE; the computing party doesn't require any sensitive information to perform the calculation and the data and results are protected by encryption.

Description - Figure 1

An illustration of the difference between computations with ordinary and homomorphic encryption. In the case of ordinary encryption, the data, a box of lines with a padlock on it, must first be decrypted using some key, resulting in the same box with an unlocked padlock. If the results must be communicated to another party, they must then be encrypted again using another key. In the case of homomorphic encryption, the computation can be performed directly, without any secret information like keys.

What can you do with homomorphic encryption?

There are a number of different computing paradigms that can be enhanced with HE, including delegated computing, data sharing and data release. These different paradigms all revolve around the fact that the data holder, analyst and computing platforms are often different parties entirely and the aim is to reduce or remove the privacy concerns that arise when one of these parties shouldn't have access to the data. It is important to note that HE uses a weaker security model than traditional cryptography and that care will need to be taken to ensure that it is used securely in practice.Footnote 1

Possibly the simplest application involves a data holder delegating their computing to another party, such as the cloud. In this scenario, a client encrypts their data and sends them along with some instructions to the cloud. The cloud can carry out those instructions homomorphically and return the encrypted results, learning nothing about the input, output or intermediate values. These instructions are modeled as circuits, which are sequences of arithmetic operations applied to some input. It should be noted that creating correct and efficient circuits with HE is not always straightforward, but theoretically there is no limit to the computations that can be run. For example, Statistics Canada has completed proofs-of-conceptFootnote 2 applying statistical analysis and neural network training on encrypted data.

As an extension of the delegated computing scenario, consider a case where there are multiple data holders. These data sources want to share their data, but are prevented due to privacy issues. The exact outline depends on the trust model; however, HE may allow these different parties to each encrypt their data and share them with a central authority who has the power to compute homomorphically. These data sharing applications can allow for better analytics in scenarios where data are limited and sheltered. An example is an oncologist who wants to test their hypotheses; patient data are typically restricted to the treating hospitals and combining these sets not only increases the strength of the model, but removes geographic data biases. Therefore, allowing multiple hospitals to share their encrypted data and allowing the oncologist to compute on this joint encrypted dataset allows for better healthcare research and outcomes.

Consider also scenarios with a central data holder and several parties who want to perform analysis on these data. An example of this is Statistics Canada's Research Data Centres, which are hosted across Canada in secure facilities managed by the organization. Accredited researchers can gain special approval to access microdata within these secure sites. While secure, the approval process takes time and the researchers must be able to physically access these sites. With HE, the data centres may be able to host the data encrypted and give access to any party who requests it. This would cut down the administrative costs of adding a new researcher and would broaden access to data in line with Canada's Open Data Initiative.

Figure 2: Illustrations of the three paradigms. First is delegated computing; the data holder encrypts and sends their data to the cloud, who returns the encrypted results after performing homomorphic calculations. Second, multiple parties encrypt and send their share of a distributed dataset which the cloud can use to perform analytics without compromising the privacy of each data holder. Third, a central data holder can give analysts access to an encrypted dataset. The analysts can be subjected to less scrutiny and restrictions because they never have direct access to the data.

Description - Figure 2

An illustration of the three paradigms. In the "delegated computing" paradigm, the data holder sends their encrypted data to the cloud, who sends the encrypted results back. In the "multiple data holder" paradigm, multiple data holders can each send their encrypted data, allowing the cloud server to perform a joint computation on the union of their datasets, resulting in a stronger analytical result. In the "data bank" paradigm, the cloud holds the data and can send encryptions of it to any analyst they choose, without fear of the data being misused.

HE can help with more than numerical calculations. For example, Private Set Intersection (PSI) allows a client in possession of a sensitive dataset to learn its intersection with a server's dataset without the server learning the client's dataset and without the client learning anything about the server's data beyond the intersection. Private String Matching is a similar protocol that allows the client to query a textual database for a matching substring. Using these and other cryptographic primitives, you can envision a broad privacy-preserving suite linking data dispersed across different government departments and public institutions. While such a system is ambitious and the exact implementations are not yet clear, it gives a taste of the types of systems that you can aspire to as more complicated tasks are completed using HE and other PPTs.

Downsides of homomorphic encryption

While there are many benefits to the use of HE, as with any technology, there are potential downsides. The price of cryptographic security is the computational cost; depending on the analysis, encrypted computation can be several orders of magnitude more expensive than unencrypted. There is also a data expansion cost that can be quite significant. This data expansion cost is exacerbated by the fact that most HE protocols involve transferring encrypted data; while cloud storage is relatively inexpensive, data transfer can be costly and complicated.

There is also a restricted set of computations allowed natively by HE. Only addition, subtraction and multiplication are native to most arithmetic schemes, and all other computations (such as exponentials, activation functions, etc.) must be approximated by a polynomial. One should note that this is true in general with all computers, but while a modern computer hides this fact from the user, HE libraries currently require the user to specify how to compute these non-trivial functions.Footnote 3 In some schemes, one also has to be wary of the depth of computations attempted. Indeed, these schemes introduce noise into the encrypted data to protect them. This noise is compounded through successive computations and, unless reduced,Footnote 4 would eventually overtake the signal, at which point decryption will no longer return the expected output. One's choice of encryption parameters is important here. Given a circuit, there exists a parameter set large enough to accommodate it, but dealing with larger parameters increases the computational cost of the protocol.
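The polynomial approximation mentioned above can be sketched in the clear. The example below approximates the exponential function with a truncated Taylor series evaluated by Horner's rule, so the evaluation itself uses only additions and multiplications. The choice of a degree-7 Taylor polynomial and the input 0.5 are assumptions for illustration; in practice other approximations may be preferred, and in an HE setting the degree also drives the multiplicative depth of the circuit.

```python
import math


def horner(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) using only additions and multiplications."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * x + c
    return acc


# Degree-7 Taylor coefficients of exp(x) around 0, computed ahead of time in the clear.
degree = 7
coeffs = [1.0 / math.factorial(k) for k in range(degree + 1)]

x = 0.5
approx = horner(coeffs, x)
error = abs(approx - math.exp(x))   # small near 0, grows as |x| increases
```

Horner's rule performs one multiplication per coefficient, each depending on the previous one, which is why higher-degree approximations call for larger encryption parameters.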

Can the extra costs in terms of computation and circuit creation be justified? Well, HE allows for computations that might not be possible otherwise. This is true with particularly sensitive datasets, such as health data. There is a huge cost inherent in obtaining permissions for an analyst to work on such data, as well as additional complications such as controlled computing environments. And once the data are shared, how do you verify that the analysts are following the rules? Some data holders may be reluctant to allow anyone access to their data at all; without some additional measures such as HE, this analysis might be impossible. The choice between "expensive computation" and "no computation" is much easier to make.

Moreover, the various schemes and their implementations are an active area of research and the library implementations regularly release improvements to their data compression and homomorphic computation algorithms. There has also been a significant amount of investment in hardware acceleration for HE recently. This is similar to the hardware that is installed on most computers, which contains specific electronic circuits designed to perform encryption and decryption operations as fast as possible. This could allow HE-accelerated cloud computers to perform analysis on encrypted data at speeds closer to that of unencrypted data.

In spite of the downsides, there are reasons to believe that HE will become an important tool for preserving privacy. That makes the present a fantastic time to begin to examine what can be done with these techniques.

The mathematics of homomorphic encryption

Now this article will delve into the inner mathematical workings of HE, including cryptographic details; hopefully even non-mathematical readers will be able to grasp the basics of how these schemes work. It should be noted that the rest of this section provides details pertaining to the scheme of Cheon, Kim, Kim and Song, which they named Homomorphic Encryption for Arithmetic of Approximate Numbers but the cryptographic community usually refers to as CKKS. That said, most of what is mentioned here applies to the other schemes with only slight modifications.

At the heart of every public key cryptosystem is a mathematical problem that is believed to be hard to solve unless you have access to a special piece of information called a secret (or private) key. A related public key can be used to encrypt plaintext data producing a ciphertext, but only knowledge of the secret key enables one to recover the original plaintext from this ciphertext. Since the public key cannot be used to decrypt, the public key can be shared with anyone wishing to encrypt data with confidence that only the secret key holder can decrypt the ciphertext to access the plaintext.

Most HE schemes use some variant of the Learning With Errors hardness assumption. This article describes the ring variant, called Ring-Learning With Errors (RLWE). Rather than integers, it deals with polynomials with integer coefficients. More precisely, you want the space of polynomials of degree less than $N$ with integer coefficients modulo $q$; this is denoted by $R_q = \mathbb{Z}_q[X]/(X^N - 1)$. You can think of this space simply as lists of $N$ integers, each less than $q$. Typically, you would take these values to be quite large; for example, $N = 2^{14}$ = 16,384 and $q \approx 2^{800}$. This makes $R_q$ large enough to hide secrets in! Figure 3 gives a toy example of the type of space you would work with.

Figure 3: A toy example of a ring of the type that might be used for HE, as well as a few of its elements. Note that the sum or product of these elements is another element in the ring.

Description - Figure 3

An example of a ring that may be of interest when working with homomorphic encryption.

$R_{17} = \mathbb{Z}_{17}[X]/(X^{16} - 1)$
$X^{15} + 11X^{14} + X^{12} + 5X^{7} + 2X^{6} + 4X^{2} + X + 16$
$X^{4} + 13X^{3} + 5X^{2} + X + 8$
$X^{10} + 16X^{8} + X^{6} + 16X^{4} + X^{2} + 16$

Here, the value of $q$ is 17 and the value of $N$ is 16. Also listed are some sample polynomials in the ring; one example is the polynomial $X^{4} + 13X^{3} + 5X^{2} + X + 8$.

Given two polynomials, you can add them or multiply them. The result of these operations is always another polynomial.Footnote 5 This makes $R_q$ a kind of sandbox that you can move around freely within. Mathematicians call a set with this property a ring and the way that these operations affect the elements of the ring is what is meant by structure. The special property of homomorphic encryption is that there exist operations in the ciphertext space that correspond homomorphically to the operations on the underlying plaintext space. The use of polynomial rings is preferred because the operations are efficient and the RLWE problem is believed to be difficult.
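The ring operations described above can be tried out directly on the toy ring with q = 17 and N = 16. The sketch below stores a polynomial as a list of N coefficients and follows the quotient as written in this article, so that a power X^N wraps around to 1. The helper names are hypothetical, and HE libraries commonly work modulo X^N + 1 instead, where the wrap-around picks up a minus sign.

```python
Q = 17   # coefficient modulus of the toy ring
N = 16   # polynomial modulus degree of the toy ring


def poly_add(f, g):
    """Add two elements coefficient by coefficient, modulo Q."""
    return [(a + b) % Q for a, b in zip(f, g)]


def poly_mul(f, g):
    """Multiply two elements and reduce modulo X^N - 1 and modulo Q."""
    out = [0] * N
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[(i + j) % N] = (out[(i + j) % N] + a * b) % Q
    return out


def from_terms(terms):
    """Build a length-N coefficient list from a {exponent: coefficient} mapping."""
    coeffs = [0] * N
    for exponent, c in terms.items():
        coeffs[exponent] = c % Q
    return coeffs


# Two of the sample elements of the toy ring.
f = from_terms({4: 1, 3: 13, 2: 5, 1: 1, 0: 8})
g = from_terms({10: 1, 8: 16, 6: 1, 4: 16, 2: 1, 0: 16})

total = poly_add(f, g)      # still a list of N integers below Q
product = poly_mul(f, g)    # likewise: the result never leaves the ring
```

Both results are again lists of 16 integers between 0 and 16, which is the "sandbox" property in action.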

How does one hide a secret in a mathematical space? Suppose you have four random polynomialsFootnote 6 in $R_q$, called $a$, $s$, $e$ and $b$. The RLWE hardness assumption states that it is very hard to distinguish a series of pairs of the form $(a, as + e)$ from a series of pairs of the form $(a, b)$. Here, "very hard to distinguish" means "parameters can be set such that all the best computers in the world working together using the best known algorithms would still not be able to solve the problem." The polynomials $a$ and $b$ can be sampled uniformly at random from all of $R_q$, but the others have a special form. In CKKS, $s$ is taken to have coefficients of ±1 or 0, and the coefficients of $e$ are sampled from a discrete Gaussian distribution over $\mathbb{Z}_q$ centred around 0. For the rest of this article, these polynomials are simply referred to as "small", because in both cases their coefficients are close to 0.

The hardness of the RLWE problem allows you to keep a secret in the following way: notice that the first pair is correlated, since there is a factor of $a$ in both polynomials, while in the second there is no correlation between the randomly selected $a$ and $b$. Now imagine someone handed you many pairs that are either all of the form $(a, as + e)$ for many different values of $e$ and a constant $s$, or all just completely random pairs. According to the hardness of RLWE, not only could you not reliably find $s$ when given the $(a, as + e)$ pairs, you couldn't even reliably determine which type of pairs you were given! Figure 4 gives a toy example of this problem for you to try at home.
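The two kinds of pairs can be generated with a few lines of code. This is a minimal sketch under stated assumptions: it uses the toy parameters q = 17 and N = 16, reduces modulo X^N - 1 as written in this article, uses a rounded Gaussian as a stand-in for a proper discrete Gaussian sampler, and relies on Python's `random` module, which is not suitable for real cryptography. All function names are hypothetical.

```python
import random

Q, N = 17, 16   # toy parameters; far too small to offer any security


def poly_mul(f, g):
    """Multiply two elements and reduce modulo X^N - 1 and modulo Q."""
    out = [0] * N
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[(i + j) % N] = (out[(i + j) % N] + x * y) % Q
    return out


def uniform_poly(rng):
    return [rng.randrange(Q) for _ in range(N)]


def ternary_poly(rng):
    """'Small' secret: coefficients of -1, 0 or 1 (stored modulo Q)."""
    return [rng.choice((-1, 0, 1)) % Q for _ in range(N)]


def small_error(rng, sigma=1.0):
    """'Small' error: rounded Gaussian, an assumed stand-in for a discrete Gaussian."""
    return [round(rng.gauss(0, sigma)) % Q for _ in range(N)]


def rlwe_pair(s, rng):
    """Return a structured pair (a, a*s + e) for a fresh a and e."""
    a = uniform_poly(rng)
    e = small_error(rng)
    b = [(x + y) % Q for x, y in zip(poly_mul(a, s), e)]
    return a, b


def random_pair(rng):
    """Return an unstructured pair (a, b) with both entries uniform."""
    return uniform_poly(rng), uniform_poly(rng)


rng = random.Random(0)
s = ternary_poly(rng)
structured = [rlwe_pair(s, rng) for _ in range(2)]
unstructured = [random_pair(rng) for _ in range(2)]
```

Anyone who knows `s` can subtract `a*s` from the second entry of a structured pair and see only a small remainder; without `s`, both lists look like lists of random numbers.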

Figure 4: Four pairs of polynomials in $R_{17} = \mathbb{Z}_{17}[X]/(X^{16} - 1)$ broken into two groups. One group is of the form $(a, as + e)$ for some fixed "small" $s$ and two different random "small" $e$, and the other group is of the form $(a, b)$. Can you tell which is which? What if 17 is changed to $2^{800}$ and 16 to 16,384? Now imagine trying to figure out what $s$ is. Note that in the RLWE assumption, you would be given just one of these groups, not both.

Description - Figure 4

Four pairs of polynomials. This is meant as a toy example of the RLWE problem for you to try at home. The polynomial pairs are separated into two groups. One group is distributed as $(a, as + e)$ for a fixed "small" polynomial $s$, and the other is of the form $(a, b)$ for random $a$ and $b$. Can you tell which is which? The point of this figure is to illustrate just how hard the RLWE problem is. The polynomials in the figure are repeated below:

$(x^{4} + 4x^{3} + 10x + 1,\ x^{8} + 6x^{7} + x^{6} + 8x^{5} + 12x^{4} + 4x^{3} + 10x^{2} + 8x + 14)$
$(x^{4} + 12x^{3} + 2x^{2} + 5x + 11,\ x^{8} + 14x^{7} + 14x^{6} + 12x^{5} + 9x^{4} + 13x^{3} + 8x^{2} + 6x + 7)$
$(x^{4} + 5x^{3} + 3x^{2} + 8,\ x^{8} + 4x^{7} + 12x^{6} + 16x^{5} + 15x^{4} + 3x^{3} + 6x^{2} + 9x + 8)$
$(x^{4} + 9x^{3} + 7x^{2} + 14x + 1,\ x^{8} + 413x^{7} + 9x^{6} + 14x^{5} + 2x^{4} + 8x^{3} + x^{2} + 13x + 12)$

The security of schemes based on RLWE follows from the fact that given $a$, $s$ and $e$ it is easy to compute $as + e$, but it is practically impossible to find $s$ given $a$ and $as + e$. You can construct a public key encryption system as follows:

  • Fix your space $R_q$ by picking a coefficient modulus $q$ and a polynomial modulus degree $N$.
  • Pick a random "small" secret key $s$, a uniformly random $a$, and a random "small" $e$ to construct your public key $(-as + e, a)$. Note the negative in this pair; this makes the encryption process more straightforward but does not affect the security of RLWE.
  • Share your public key with the world and no one will be able to find your secret key! Hence, anyone in possession of this public key can encrypt data and send them to some party to perform computations on them homomorphically. In the end, the results can only be decrypted and viewed using the secret key.
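The steps above can be sketched as a toy encryption system with a homomorphic addition. Everything beyond the key pair $(-as + e, a)$ is an assumption made for illustration: the parameters are tiny, all "small" polynomials (including the errors) have coefficients in {-1, 0, 1}, the message is encoded naively as one scaled integer per coefficient rather than with the CKKS encoding, and the arithmetic is reduced modulo X^N - 1 as written in this article. It is not secure and is not how any HE library implements CKKS.

```python
import random

N = 16           # toy polynomial modulus degree
Q = 2**24        # toy coefficient modulus
DELTA = 2**12    # scale that keeps the message well above the noise


def add(f, g):
    return [(x + y) % Q for x, y in zip(f, g)]


def neg(f):
    return [(-x) % Q for x in f]


def mul(f, g):
    # Product reduced modulo X^N - 1 and modulo Q.
    out = [0] * N
    for i, x in enumerate(f):
        for j, y in enumerate(g):
            out[(i + j) % N] = (out[(i + j) % N] + x * y) % Q
    return out


def small(rng):
    # "Small" polynomial with coefficients in {-1, 0, 1} (a simplification).
    return [rng.choice((-1, 0, 1)) % Q for _ in range(N)]


def keygen(rng):
    s = small(rng)
    a = [rng.randrange(Q) for _ in range(N)]
    e = small(rng)
    return s, (add(neg(mul(a, s)), e), a)    # secret key, public key (-as + e, a)


def encrypt(pk, values, rng):
    b, a = pk
    m = [(DELTA * v) % Q for v in values]    # naive encoding, one integer per slot
    u, e1, e2 = small(rng), small(rng), small(rng)
    return add(add(mul(b, u), e1), m), add(mul(a, u), e2)


def decrypt(s, ct):
    c0, c1 = ct
    noisy = add(c0, mul(c1, s))              # equals m plus a small noise term
    centred = [x - Q if x > Q // 2 else x for x in noisy]
    return [round(x / DELTA) for x in centred]


rng = random.Random(1)
s, pk = keygen(rng)
v1 = [3, 1, 4, 1, 5, 9, 2, 6] + [0] * 8
v2 = [2, 7, 1, 8, 2, 8, 1, 8] + [0] * 8
ct1, ct2 = encrypt(pk, v1, rng), encrypt(pk, v2, rng)

# Homomorphic addition: add the ciphertexts component-wise, with no key involved.
ct_sum = (add(ct1[0], ct2[0]), add(ct1[1], ct2[1]))
decrypted_sum = decrypt(s, ct_sum)
```

Decryption works because the terms involving $a$ cancel, leaving the scaled message plus a noise term that is much smaller than `DELTA` and disappears when rounding.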

To encrypt the data, they must first be encoded as a vector $v$ of real numbers. This is straightforward when you are working with numerical data and is a standard practice when working with textual or other types of data. To encrypt, the data vector $v$ is first encoded as a polynomialFootnote 7 $m$ in $R_q$ and then combined with the public key to get a ciphertext, which will be denoted by $[v]$. Now, send this off to the computing party, who will perform homomorphic additions and multiplications to implement the calculation of interest. Figure 5 outlines a simple circuit computing a polynomial function. Once the computations are completed and the output ciphertexts are returned, you can use your secret key to decrypt and view the results.

Figure 5: A visualization of a homomorphic circuit. A vector of values can be encrypted into a single ciphertext and computed on at once. Pictured is just one realization of a circuit to compute the polynomial f(x). Values with padlocks are encrypted and are thus unreadable to the computing party.

Description - Figure 5

A homomorphic circuit that evaluates the function $f(x) = x^{3} + 4x^{2} + 2x + 1$ on a vector of values. Padlocks represent values that are encrypted and are thus unreadable to the computing party. Arrows and operations represent how one could actually encode the circuit in a homomorphic encryption library.
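The structure of such a circuit can be mimicked in the clear. The sketch below evaluates the same polynomial slot by slot using only additions and multiplications, with plain Python lists standing in for the encrypted vector. The particular ordering of operations is one possible realization chosen for this example and may differ from the one drawn in the figure; with an HE library, each step would be replaced by that library's corresponding ciphertext operation.

```python
def circuit(x):
    """Evaluate f(x) = x^3 + 4x^2 + 2x + 1 on every slot with + and * only."""
    x2 = [a * a for a in x]                  # ciphertext * ciphertext, depth 1
    x3 = [a * b for a, b in zip(x2, x)]      # ciphertext * ciphertext, depth 2
    t2 = [4 * a for a in x2]                 # multiplication by a plaintext constant
    t1 = [2 * a for a in x]                  # multiplication by a plaintext constant
    return [a + b + c + 1 for a, b, c in zip(x3, t2, t1)]


values = [0, 1, 2, -1.5]     # stand-in for the encrypted input vector
outputs = circuit(values)    # one result per slot, all computed at once
```

The two ciphertext-by-ciphertext multiplications are what determine the depth of this circuit, and therefore how much noise growth the encryption parameters must accommodate.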

While this article did not explore all of the details of how these operations are implemented mathematically, the description of HE given so far provides the background needed to further learn about HE.

How to get started with homomorphic encryption

To get started with HE, take a look at some of the available open-source HE libraries; you can try Microsoft SEAL, PALISADE Homomorphic Encryption Software Library, TFHE: Fast Fully Homomorphic Encryption over the Torus, or even Concrete: Open-source Homomorphic Encryption Library if you are a Rustacean (that is, someone who uses Rust). Between them, these libraries implement multiple HE schemes, so you can pick the one that's best for your use case. We reiterate that, until the standardization process has finished, the Government of Canada does not recommend using HE with any sort of sensitive data.

While all of the different HE schemes will implement most use cases, some schemes will perform better on some problems. The CKKS scheme is designed to work on real numbers; if you are interested in statistics or machine learning, you should probably start here! Brakerski/Fan-Vercauteren and Brakerski-Gentry-Vaikuntanathan are great for integer arithmetic and implementing the computer science primitives such as private set intersection or string matching. TFHE implements logical gates natively and refreshes the ciphertext noise with every operation, allowing improved efficiency with longer circuit depths. Readers who are interested are encouraged to try some simple circuits using each scheme and compare the results and performance!

If you would like more information on the cyber security aspects of homomorphic encryption, including standardization activities, contact the Canadian Centre for Cyber Security at contact@cyber.gc.ca, (613) 949-7048 or 1-833-CYBER-88.

Conclusion

This article took an in-depth look at homomorphic encryption, from its applications to the RLWE problem. Next, this series on privacy preserving techniques will look at some proofs-of-concept that have been completed by applying HE at Statistics Canada! It will also cover some of the more advanced aspects of the CKKS interface, including rotations, choice of parameters, packing, bootstrapping, scale and levels.

Want to keep in the loop about these emerging technologies, or want to share your work in the field of privacy? Check out the Privacy Preserving Technologies Community of Practice page (Government of Canada employees only) to discuss this series of privacy articles, connect with peers interested in privacy and share resources and ideas with the community. You can also give feedback on this topic or leave suggestions for future articles in this series.

Note: We wish to acknowledge the input provided on this article by the Canadian Centre for Cyber Security and the Tutte Institute for Mathematics and Computing, both part of Communications Security Establishment.

Federal government expenditures on COVID-19 response measures - Q2 - 2021

On March 11, 2020, the World Health Organization declared the COVID-19 pandemic. To address the consequences of the pandemic on the Canadian economy, the federal government of Canada announced and implemented various support and recovery measures for businesses, households, students, the vulnerable population and organizations helping individuals. The table Federal government expenditures on COVID-19 response measures presents the major federal measures announced and implemented, their treatment in the national accounts (in particular, in the Income and Expenditure Accounts), the table numbers where the pertinent series may be found and the amount of expenditure on a quarterly basis.

For comprehensive explanations of the treatment of COVID-19 government support measures in the national accounts, please refer to the documents Recording COVID-19 measures in the national account and Recording new COVID measures in the national accounts.

Treatment in national accounts: Subsidies on production, by quarter at quarterly rates
COVID-19 measure 2020 2021
First quarter Second quarter Third quarter Fourth quarter First quarter Second quarter
$ millions
Canada Emergency Wage Subsidy (CEWS) - business 4,359 29,351 22,711 10,703 10,033 7,633
Temporary Wage Subsidy (TWS) - business 169 739        
Canada Emergency Rent Subsidy (CERS) - business     52 1,558 1,714 1,096
Lockdown Support (LS) - business     5 209 341 237
Source: Statistics Canada, tables 36-10-0103, 36-10-0118, 36-10-0477.
Treatment in national accounts: Current transfers to non-profit institutions serving households (NPISH), by quarter at quarterly rates
COVID-19 measure 2020 2021
First quarter Second quarter Third quarter Fourth quarter First quarter Second quarter
$ millions
Canada Emergency Wage Subsidy (CEWS) - NPISH 200 1,095 1,051 573 549 325
Temporary Wage Subsidy (TWS) - NPISH 13 46        
Canada Emergency Rent Subsidy (CERS) - NPISH     1 36 38 22
Lockdown Support (LS) - NPISH     0 4 7 4
Source: Statistics Canada, tables 36-10-0118, 36-10-0477, 36-10-0115.
Treatment in national accounts: Subsidies on products and imports, by quarter at quarterly rates
COVID-19 measure 2020 2021
First quarter Second quarter Third quarter Fourth quarter First quarter
$ millions
Canada Emergency Commercial Rent Assistance (CECRA)   1,130 904    
  • Federal contribution   849 679    
  • Provincial contribution   281 225    
Source: Statistics Canada, tables 36-10-0103, 36-10-0118, 36-10-0477.
Treatment in national accounts: Current transfers to households - Employment Insurance benefits, by quarter at quarterly rates
COVID-19 measure 2020 2021
First quarter Second quarter Third quarter Fourth quarter First quarter Second quarter
$ millions
Canada Emergency Response Benefit (CERB) - EI stream   19,127 9,239 864    
Source: Statistics Canada, tables 36-10-0118, 36-10-0477, 36-10-0112.
Treatment in national accounts: Transfers to households - Other federal transfers to households, by quarter at quarterly rates
COVID-19 measure 2020 2021
First quarter Second quarter Third quarter Fourth quarter First quarter Second quarter
$ millions
Canada Emergency Response Benefit (CERB) - CRA stream   29,002 15,597 704    
Canada Emergency Student Benefit (CESB)   1,386 1,550 8  
Canada Recovery Benefit (CRB)       6,073 7,280 6,516
Canada Recovery Caregiving Benefit (CRCB)       900 960 933
Canada Recovery Sickness Benefit (CRSB)       246 144 188
Source: Statistics Canada, tables 36-10-0118, 36-10-0477, 36-10-0112.

Life Expectancy and Deaths Statistics

What are provisional deaths in Canada?

Provisional deaths are not based on all deaths that are observed during a specific reference period because of reporting delays. Provisional death counts are based on what is reported to Statistics Canada by provincial and territorial vital statistics registries.

Provisional death estimates have been adjusted to account for incomplete data. As a result, the provisional death counts and estimates released may not match figures from other sources, such as media reports, or counts and estimates from provincial and territorial health authorities and other agencies.

Visualizing mortality in Canada

Explore the cause of death trends in Canada since 2000 with these interactive dashboards. Metrics visualized on the dashboards are: number of deaths, death rate per 100,000 people, and the proportion of deaths represented by each selected cause of death.

Rates and counts by age group for select causes of death

Cause of death trends in Canada broken down by several age groups between 0 to 90 years of age and by sex.

Rates and counts by sex and province or territory for select causes of death

Cause of death trends in Canada broken down by province or territories and by sex.

Focus on COVID-19

Learn more about provisional deaths and excess mortality:

Our partners

Data sources

Frequently asked questions

Death certification and classification

COVID-19 comorbidities in Canada

The Daily articles

COVID-19 insights

What’s trending in health? Visit Statistics Canada’s official release bulletin

Explore the mortality dashboard

The Provisional Deaths in Canada Dashboard allows users to examine recent mortality trends, by comparing the number of deaths being observed with previous years. Comparing provisional death counts and death estimates over time can be useful for understanding trends in mortality. As Canada's population grows and ages, the number of deaths is expected to increase from year to year. The Canadian Vital Statistics Death (CVS-D) database is the authoritative source for cause of death data in Canada. The CVS-D is an administrative survey that collects demographic and medical information from all provincial and territorial vital statistics registries on all deaths in Canada.

Modelling SARS-CoV-2 Dynamics to Forecast PPE Demand

By: Jihoon Choi, Deirdre Hennessy and Joel Barnes, Statistics Canada

Personal protective equipment (PPE) has become an important part of the lives of all Canadians as the pandemic changed the way we interact with one another and protect ourselves. The rapid spread of the novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes COVID-19, has put unprecedented demands on the Government of Canada to provide timely, accurate and relevant information to inform decision-making around a host of public health issues, including the procurement of PPE and its deployment to the provinces and territories.

The global pandemic caused by SARS-CoV-2 poses a serious public health concern for Canadians.Footnote 1 As of October 2021, over 1.71 million diagnosed cases had been reported in Canada, making it essential that Canadians have access to PPE when they need it.

PPE refers to commodities such as masks, gloves and gowns that are worn to provide protection against potential exposure to infectious pathogens. The pandemic has brought severe stress to the supply chains for PPE in Canada, causing a significant disruption in supply among sectors where PPE stocks are essential (e.g., hospitals, long-term care facilities).Footnote 2 For this reason, forecasts on the pandemic trajectory and its effect on the supply, demand and inventory of PPE have become a crucial element in decision-making.Footnote 3, Footnote 4

Epidemiological models can contribute valuable insights to public health decision-making by generating a number of 'what-if' scenarios under different assumptions. Furthermore, they can help estimate how different public health intervention measures affect the outcome of the epidemic (e.g., when to introduce lockdowns or reopening in each province).Footnote 5 Many variations of epidemiological models exist, and many of these are compartmental models, in which the population is divided into multiple compartments and individuals move from one compartment to another at a defined rate.Footnote 6

The Susceptible-Infectious-Recovered (SIR) model is one of the most basic forms of a compartmental model (Figure 1). This model consists of three compartments, where S is the number of susceptible individuals, I is the number of infected individuals and R is the number of recovered (and immune) individuals.

Figure 1 – Structure of a basic epidemiological model

Figure 1 shows the base structure of the SIR model. The initial population starts in the susceptible compartment and flows into the infectious compartment at an infection rate β, then moves into the recovered compartment at a recovery rate defined by λ.
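The flows in Figure 1 can be sketched numerically. The following is a minimal illustration only, not the project's code: it assumes the standard SIR equations, a fixed total population, forward Euler integration, and made-up parameter values. The function name `simulate_sir` and all numbers are hypothetical; `beta` and `lam` stand for the article's β and λ.

```python
def simulate_sir(s0, i0, r0, beta, lam, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    beta: infection rate (S -> I), lam: recovery rate (I -> R), both per day.
    Returns one (S, I, R) tuple per day, starting with the initial state.
    """
    n = s0 + i0 + r0  # total population, assumed constant
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = lam * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Illustrative values only: 1,000 people and one initial case.
trajectory = simulate_sir(s0=999, i0=1, r0=0, beta=0.3, lam=0.1, days=160)
# Day on which the number of infectious individuals is highest.
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
```

Because every person leaving one compartment enters the next, the three compartments always sum to the initial population.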

The origin of compartmental models in epidemiology dates back to the early 20th century. Specifically, the foundation was laid by the theoretical work of Ronald Ross, William Hamer, Anderson McKendrick and William Kermack, along with important statistical contributions by John Brownlee.Footnote 7 Since their development, compartmental models have proven useful in modelling numerous communicable diseases, such as malaria and plague.Footnote 8, Footnote 9

As the SARS-CoV-2 outbreak became a serious public health concern for Canadians, Health Canada commissioned the Data Science Division (DScD) and the Health Analysis Division (HAD) at Statistics Canada to create an epidemiological model that could forecast the trajectories of the outbreak in Canadian provinces. The forecasted cases and hospitalizations produced from the epidemiological model are used in the PPE Project to estimate the PPE demand in various sectors across the provinces. The PPE Project aims to inform decisions related to procurement, allocation and domestic production investment in PPE through evidence-based reports on the current status and projections of PPE supply and demand, in diverse epidemiological scenarios.

Creating the initial model for PPE demand: Susceptible – Infected – Recovered – Death (SIRD) model

The initial SIRD model first used Bayesian methods to estimate the number of active infections in Canadian communities based on SARS-CoV-2 mortalities. The total number of SARS-CoV-2 infections (diagnosed and undiagnosed) was back-calculated from SARS-CoV-2 fatalities by province and territory, using a method similar to that used by Flaxman et al.Footnote 10 The estimated numbers of infections, deaths and recoveries were fed into a simple compartmental model composed of four compartments. The first three compartments are equivalent to the base SIR model (Susceptible, Infected and Recovered), but this model has an additional compartment, D, which represents the deceased population (Figure 2).

Figure 2 – Structure of a SIRD epidemiological model

Figure 2 shows the base structure of SIRD (Susceptible – Infected – Recovered – Death) model. The initial population starts in the susceptible compartment and flows into the infectious compartment at an infection rate β, then moves into the recovered compartment at a recovery rate defined by λ or into the deceased compartment at a mortality rate defined by γ.
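As a rough sketch of the structure in Figure 2, the SIR equations can be extended with a second outflow from the infectious compartment. This is an illustration under stated assumptions (forward Euler steps, a fixed population denominator, invented parameter values), not the model used in the PPE Project. `simulate_sird` is a hypothetical name; `beta`, `lam` and `gamma` stand for the article's β, λ and γ.

```python
def simulate_sird(state, beta, lam, gamma, days, dt=0.1):
    """Integrate a basic SIRD model; returns daily (S, I, R, D) tuples.

    beta: infection rate, lam: recovery rate, gamma: mortality rate (per day).
    """
    s, i, r, d = (float(x) for x in state)
    n = s + i + r + d  # initial population, used as a fixed denominator
    daily = [(s, i, r, d)]
    for _ in range(days):
        for _ in range(round(1 / dt)):
            infections = beta * s * i / n * dt
            recoveries = lam * i * dt
            deaths = gamma * i * dt
            s -= infections
            i += infections - recoveries - deaths
            r += recoveries
            d += deaths
        daily.append((s, i, r, d))
    return daily


# Illustrative values only.
result = simulate_sird((9990, 10, 0, 0), beta=0.25, lam=0.099, gamma=0.001, days=300)
final_s, final_i, final_r, final_d = result[-1]
```

In this simple form, the share of resolved infections that end in death is γ / (λ + γ), which is one way such a model can be tied to a fatality rate from the literature.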

This model also produced a dynamic historical reproduction number, R(t). R(t) is an important concept in infectious disease epidemiology, providing information about the transmission potential of an infectious agent. In other words, it shows how contagious an infectious disease is at time t in the study population. Generally, if R(t) is greater than 1, the number of new cases will grow and the disease will propagate in the population, whereas if R(t) is less than 1, the number of new cases will decrease.
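For the simple SIRD structure in Figure 2, one common way to write this quantity uses the article's symbols together with two notational assumptions that are not taken from the project itself: a time-varying infection rate β(t) and a total population N.

```latex
R(t) = \frac{\beta(t)}{\lambda + \gamma} \cdot \frac{S(t)}{N},
\qquad
\frac{dI}{dt} = (\lambda + \gamma)\,\bigl(R(t) - 1\bigr)\, I(t)
```

The second expression makes the threshold explicit: the infectious population grows when R(t) > 1 and shrinks when R(t) < 1.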

R(t) is often estimated by observing the number of new infections across a time period. However, the number of SARS-CoV-2 cases was not traced accurately at the beginning of the pandemic because of resource limitations, such as insufficient availability of testing kits.Footnote 11 As a workaround, the SIRD model estimated the historical R(t) from the number of SARS-CoV-2 fatalities, which was a much more reliable measure than actual case counts during the initial periods of the outbreak. An infection fatality rate (IFR) for SARS-CoV-2 from the research literature was used to back-calculate the historical R(t).
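The idea of working backwards from deaths can be illustrated with a deliberately simplified, deterministic stand-in for the Bayesian approach described above. The sketch assumes a single fixed delay between infection and death and a constant IFR; both are simplifying assumptions, and the function name, delay and IFR value are hypothetical.

```python
def infections_from_deaths(daily_deaths, ifr, delay_days):
    """Back-calculate daily infections from daily deaths.

    Assumes every death follows its infection by exactly `delay_days` and that
    a fixed share `ifr` of infections is fatal. The last `delay_days` entries
    are None because the deaths that would inform them have not occurred yet.
    """
    if not 0 < ifr <= 1:
        raise ValueError("ifr must be in (0, 1]")
    n = len(daily_deaths)
    infections = [None] * n
    for day in range(delay_days, n):
        infections[day - delay_days] = daily_deaths[day] / ifr
    return infections


# Illustrative series: deaths double each day from day 3 onward.
deaths = [0, 0, 0, 1, 2, 4, 8]
estimated = infections_from_deaths(deaths, ifr=0.01, delay_days=3)
# With a 1% IFR, one death on day 3 implies about 100 infections on day 0.
```

A historical R(t) can then be derived from the growth of the estimated infection series rather than from reported case counts.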

To forecast the future R(t), the team generated different pandemic scenarios each with varying assumptions about public health intervention measures in effect:

  • The SARS-CoV-2 Containment scenario: models a situation where strict public health intervention measures (e.g., lockdowns) are in place. Under this scenario, R(t) is always kept under 1.
  • The Resurgence Best Estimate scenario: allows the epidemic to resurge in tandem with the reopening of the economy and allows R(t) to stay high.
  • The Peaks and Valleys scenario: allows the epidemic to resurge in tandem with the reopening of the economy until hospital intensive care unit (ICU) occupancy reaches 30% of the provincial maximum. An intervention plan is then triggered to bring R(t) back down to the lockdown level.
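The three scenarios can be read as rules for choosing the future R(t). The sketch below is one hypothetical encoding: the scenario labels, the function name and the two R values (0.8 under lockdown, 1.3 after reopening) are assumptions for illustration, and only the 30% ICU trigger comes from the text. The text does not say when an intervention is lifted, so this version simply re-evaluates the trigger at each call.

```python
def scenario_r(scenario, icu_occupied, icu_capacity,
               r_lockdown=0.8, r_reopen=1.3, icu_trigger=0.30):
    """Return the assumed R(t) for one time step under a given scenario."""
    if scenario == "containment":
        return r_lockdown  # strict measures keep R(t) below 1
    if scenario == "resurgence":
        return r_reopen  # reopening lets R(t) stay high
    if scenario == "peaks_and_valleys":
        # Intervene once ICU occupancy reaches the trigger share of capacity.
        triggered = icu_occupied / icu_capacity >= icu_trigger
        return r_lockdown if triggered else r_reopen
    raise ValueError(f"unknown scenario: {scenario!r}")


calm = scenario_r("peaks_and_valleys", icu_occupied=10, icu_capacity=100)
strained = scenario_r("peaks_and_valleys", icu_occupied=45, icu_capacity=100)
```

Here `calm` takes the reopening value and `strained` takes the lockdown value, which produces the alternating peaks and valleys when the rule is applied over time.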

The SIRD model was used as the main epidemiological model for the PPE Project until the beginning of 2021. The model showed reasonable accuracy in forecasting the pandemic during the initial phase of the outbreak. However, it had a number of limitations. In particular, it did not take the age structure of the population into account. These limitations led to the creation of another version of the epidemiological model with additional compartments that can take more complex characteristics of the pandemic into consideration.

The current model: Susceptible – Exposed – Infected – Recovered – Deceased – Vaccinated (SEIRDV) model

Early in the pandemic, DScD and HAD at Statistics Canada worked with the Public Health Agency of Canada (PHAC) to develop an age-structured, multi-compartmental SIR model. This collaboration yielded the SEIRDV model, which was adapted by the Statistics Canada PPE epidemiological team, in collaboration with Health Canada, for use in the main PPE demand and supply model. This model has been used as the main epidemiological model in the PPE project since January 2021 (Figure 3).

Figure 3 – Simplified structure of a SEIRDV epidemiological model

Figure 3 shows a simplified structure of the SEIRDV (Susceptible – Exposed – Infected – Recovered – Death – Vaccinated) model. The population starts in the susceptible compartment and then can flow into exposed and infectious compartments upon contracting the disease. Some of these infections are detected through contact tracing or SARS-CoV-2 testing. Individuals whose infections have been detected are sent to the quarantine path and will have a reduced likelihood of spreading the disease to others. Upon infection, individuals with severe symptoms will seek medical attention. The severely symptomatic population can end in two terminal states: deceased or recovered. People who are only mildly symptomatic or asymptomatic will flow into the recovered compartment over time. In addition, the population can be vaccinated in this model. If an individual is vaccinated, their chances of flowing into the infection compartments are reduced by the protection rate of the vaccine. Similarly, the vaccinated population has a reduced probability of developing severe cases and, therefore, of flowing into the health care system (i.e., hospital or ICU).

The four major modifications made by introducing the SEIRDV model are:

1. The model allows the study population to be age stratified

In the SEIRDV model, the population is divided into six distinct age groups (0-9 years, 10-19 years, 20-39 years, 40-59 years, 60-74 years, 75+ years), which allows different parameters to be set for each age group and to take age-related differences into account.

For instance, reports show that younger age groups have a reduced likelihood of hospitalization and mortality compared to older age groups.Footnote 12 Since the SEIRDV model allows users to set different flow rates for each age group, it is capable of modelling this effect.

Similarly, certain age groups are known to interact at a higher frequency than others (e.g., parents with their children) and therefore have increased chances of transmitting the disease to each other. In the SEIRDV model, this effect can be taken into account by using an interaction matrix that models the average contact rate between two age groups.

2. Estimation of the transmission rate (β) has been improved

Instead of relying on a single measure, such as R(t), to estimate the transmission rate, the model now uses three different parameters to calculate the rate of transmission.

The first is β, which in this model represents the "probability of transmission upon contact". This number is estimated from the literature and calibrated in accordance with the dominant strain of SARS-CoV-2 in each province. The second is a contact matrix, a numeric matrix giving the average number of contacts that people in each age group make with each other age group; β is multiplied by this matrix. The third is a contact multiplier, which is applied to take variations in contact rates into account. When different public health intervention measures are in effect (e.g., lockdowns), the rate of contact among the population changes accordingly. These variations are captured by calibrating the contact multiplier every week to the reported number of daily active cases in each province.
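How the three parameters combine can be sketched as a per-age-group infection pressure. This is an assumed formulation for illustration, not the project's implementation: the function name, the two-group contact matrix (the model itself uses six age groups) and every number are hypothetical.

```python
def force_of_infection(beta, contact_matrix, contact_multiplier,
                       infectious, population):
    """Daily infection pressure on a susceptible person in each age group.

    beta: probability of transmission upon contact.
    contact_matrix[a][b]: average daily contacts a person in group a has
        with people in group b.
    contact_multiplier: scales all contacts, e.g. below 1 during a lockdown.
    """
    # Share of each age group that is currently infectious.
    prevalence = [i / n for i, n in zip(infectious, population)]
    return [
        beta * contact_multiplier * sum(c * p for c, p in zip(row, prevalence))
        for row in contact_matrix
    ]


# Two illustrative age groups instead of six.
contacts = [[8.0, 3.0],
            [3.0, 5.0]]
pressure = force_of_infection(
    beta=0.05,
    contact_matrix=contacts,
    contact_multiplier=0.5,  # assumed halving of contacts under restrictions
    infectious=[10, 20],
    population=[1000, 2000],
)
```

Recalibrating only `contact_multiplier` each week leaves β and the contact matrix fixed while still tracking changes in behaviour.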

3. The effect of vaccination is taken into consideration

Vaccination has two main effects: it reduces the stress on the health care system (by providing protection against developing a severe case requiring hospitalization) and it reduces transmission of the disease within the community (by providing protection against infection, ultimately promoting herd immunity). The current design of the SEIRDV model takes this into account by introducing a distinct vaccination pathway. The vaccinated population will flow into this pathway, where they will have reduced chances of contracting the disease as well as a reduced likelihood of developing severe symptoms requiring hospitalization.

The model also takes into account the two-dose vaccination plan set out by the National Advisory Committee on Immunization. Vaccination data were retrieved from PHAC and the COVID-19 Canada Open Data Working Group (CCODWG) to estimate the number of doses that can be given out each day per province. In addition, the different rates of protection given by the two-stage vaccination plan were modelled by dividing the vaccination path into four distinct compartments. This process is summarized in Figure 4.

Figure 4 – Design of the vaccination compartment

Description - Figure 4

Demonstrates how a population is divided into age groups, with vaccines distributed from oldest to youngest with accommodation for some high risk groups of all ages. The groups flow through first and second doses on their way to full vaccination.

The study population is divided into six distinct age groups (0-9 years, 10-19 years, 20-39 years, 40-59 years, 60-74 years, 75+ years) and vaccines are distributed from older to younger age groups, with a small number of doses allocated in the early phase to an age group that represents health care professionals. Upon receiving the first dose, the newly vaccinated population flows into the first vaccination compartment, which represents people who have received their vaccine but have not yet had the chance to develop any immunity. After a set period, this population flows into the second vaccination compartment, at which point they develop partial protection against SARS-CoV-2. The population stays in this compartment until phase 1 (i.e., administration of first doses) is complete. Once phase 2 of the vaccination plan starts, the population flows into the third vaccination compartment, where they receive their second dose, and then into the last vaccination compartment, where they develop the maximum immunity that they can gain from the vaccination.
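The four vaccination compartments behave like a small state machine. The sketch below is a hypothetical encoding of that flow: the stage names, the 14-day onset period and the protection values are invented for illustration, and the rule that the third compartment leads to the fourth after the same onset period is an assumption, since the text does not give the timing.

```python
from enum import Enum


class VaccineStage(Enum):
    DOSE1_NO_IMMUNITY = 1  # first dose received, no protection yet
    DOSE1_PARTIAL = 2      # partial protection, waiting for phase 2
    DOSE2_RECEIVED = 3     # second dose received
    DOSE2_FULL = 4         # maximum protection from vaccination


# Hypothetical protection against infection at each stage.
PROTECTION = {
    VaccineStage.DOSE1_NO_IMMUNITY: 0.0,
    VaccineStage.DOSE1_PARTIAL: 0.6,
    VaccineStage.DOSE2_RECEIVED: 0.6,
    VaccineStage.DOSE2_FULL: 0.9,
}


def next_stage(stage, days_in_stage, phase2_started, onset_days=14):
    """Return the compartment a vaccinated person occupies after this step."""
    if stage is VaccineStage.DOSE1_NO_IMMUNITY and days_in_stage >= onset_days:
        return VaccineStage.DOSE1_PARTIAL
    if stage is VaccineStage.DOSE1_PARTIAL and phase2_started:
        return VaccineStage.DOSE2_RECEIVED
    if stage is VaccineStage.DOSE2_RECEIVED and days_in_stage >= onset_days:
        return VaccineStage.DOSE2_FULL
    return stage


def infection_risk_multiplier(stage):
    """Relative chance of flowing into the infection compartments."""
    return 1.0 - PROTECTION[stage]
```

For example, `next_stage(VaccineStage.DOSE1_PARTIAL, 100, phase2_started=False)` leaves a person in the partial-protection compartment no matter how long they have waited, which mirrors the description of the population staying there until phase 1 completes.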

4. Impact of variant of concern (VOC) can be modelled

A number of different strains of SARS-CoV-2 have been sequenced around the world as a result of viral mutation, and some have shown higher rates of transmission or mortality.Footnote 13 These variants are called variants of concern (VOC) and have become a crucial factor to consider in epidemiological modelling of SARS-CoV-2. The SEIRDV model can represent them by altering the probability of transmission (β) to reflect the increased transmission rate, and by altering the flow into the hospitalization or deceased compartments to reflect the increased severity of symptoms caused by the variant. Using this mechanism, the team has successfully modelled the effect of the B.1.1.7 (Alpha) variant.
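One simple way to picture the adjustment to β is a blend weighted by the variant's share of infections. The text does not state how β was altered, so this weighting is an assumption, and the function name and numbers are hypothetical.

```python
def effective_beta(beta_base, voc_share, voc_transmission_ratio):
    """Blend the baseline and variant transmission probabilities.

    voc_share: fraction of current infections caused by the variant (0 to 1).
    voc_transmission_ratio: variant transmissibility relative to baseline,
        e.g. 1.5 for a variant assumed to be 50% more transmissible.
    """
    if not 0.0 <= voc_share <= 1.0:
        raise ValueError("voc_share must be between 0 and 1")
    return beta_base * ((1.0 - voc_share) + voc_share * voc_transmission_ratio)


# Illustrative values: the variant causes half of infections.
blended = effective_beta(beta_base=0.05, voc_share=0.5, voc_transmission_ratio=1.5)
```

The same pattern could be applied to the hospitalization and mortality flows to reflect increased severity.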

Conclusion

Through continuous development, enhancement and calibration efforts, the epidemiological model has made a valuable contribution to modelling the trajectory of the SARS-CoV-2 pandemic in Canada. Specifically, findings from this model have allowed the PPE Project to estimate PPE demand across Canadian provinces so that all sectors can acquire sufficient PPE stocks in advance of large outbreaks.

Furthermore, this article demonstrates how applications of data science, combined with statistics, computer science and epidemiology, can be utilized in public health planning as well as decision making for resource requirements during the COVID-19 pandemic.

How was this achieved?

Areas of further study

Given that the SARS-CoV-2 pandemic is still ongoing, more work remains to be done. Some potential future areas of study include:

  • New variants
    With the high rate of mutation observed in SARS-CoV-2, new variants are constantly being sequenced around the world. While the effect of the B.1.1.7 variant has been considered in the model, there are still several other VOCs that may need to be considered (e.g., the Delta variant). The team is closely monitoring the spread of VOCs across the country to determine whether other variants need to be taken into account in the model.
  • Waning immunity
    Studies have shown that immunity gained from vaccination (or infection) does not last indefinitely. Protective antibody levels decline progressively over time, a phenomenon called waning immunity. This will need to be taken into account in the model to prepare for future scenarios, such as one in which a large portion of the population requires another dose of vaccine to maintain their immunity.

The PPE epidemiological modelling team:
Jihoon Choi (DScD), Deirdre Hennessy (HAD), Joel Barnes (HAD).

Project team and contributors:
Rubab Arim, Statistics Canada; Kayle Hatt, Health Canada

National Travel Survey: C.V.s for Visit-Expenditures by Duration of Visit, Main Trip Purpose and Country or Region of Expenditures – Q1 2021

National Travel Survey: C.V.s for Visit-Expenditures by Duration of Visit, Main Trip Purpose and Country or Region of Expenditures, including expenditures at origin and those for air commercial transportation in Canada, in Thousands of Dollars (x 1,000)
Table summary
This table displays the results of C.V.s for Visit-Expenditures by Duration of Visit, Main Trip Purpose and Country or Region of Expenditures. The information is grouped by Duration of trip (appearing as row headers), Main Trip Purpose, Country or Region of Expenditures (Total, Canada, United States, Overseas) calculated using Visit-Expenditures in Thousands of Dollars (x 1,000) and c.v. as units of measure (appearing as column headers).
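| Duration of Visit | Main Trip Purpose | Total ($ '000) | Total C.V. | Canada ($ '000) | Canada C.V. | United States ($ '000) | United States C.V. | Overseas ($ '000) | Overseas C.V. |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Total Duration | Total Main Trip Purpose | 4,388,384 | B | 3,584,773 | A | 348,650 | E | 454,962 | C |
| Total Duration | Holiday, leisure or recreation | 2,095,885 | B | 1,642,628 | A | 265,540 | E | 187,717 | D |
| Total Duration | Visit friends or relatives | 762,081 | B | 715,270 | B | 10,745 | E | 36,065 | E |
| Total Duration | Personal conference, convention or trade show | 30,793 | E | 29,511 | E | 849 | E | 433 | E |
| Total Duration | Shopping, non-routine | 267,241 | B | 267,241 | B | .. | | .. | |
| Total Duration | Other personal reasons | 679,391 | C | 456,425 | B | 67,117 | E | 155,850 | E |
| Total Duration | Business conference, convention or trade show | 31,336 | E | 9,449 | E | 123 | E | 21,764 | E |
| Total Duration | Other business | 521,658 | B | 464,249 | B | 4,276 | E | 53,133 | E |
| Same-Day | Total Main Trip Purpose | 1,655,756 | B | 1,555,199 | A | 100,557 | E | .. | |
| Same-Day | Holiday, leisure or recreation | 573,716 | D | 473,915 | B | 99,802 | E | .. | |
| Same-Day | Visit friends or relatives | 278,527 | B | 277,949 | B | 578 | E | .. | |
| Same-Day | Personal conference, convention or trade show | 27,842 | E | 27,842 | E | .. | | .. | |
| Same-Day | Shopping, non-routine | 242,842 | B | 242,842 | B | .. | | .. | |
| Same-Day | Other personal reasons | 281,383 | B | 281,205 | B | 178 | E | .. | |
| Same-Day | Business conference, convention or trade show | 900 | E | 900 | E | .. | | .. | |
| Same-Day | Other business | 250,546 | B | 250,546 | B | .. | | .. | |
| Overnight | Total Main Trip Purpose | 2,732,628 | B | 2,029,574 | B | 248,092 | E | 454,962 | C |
| Overnight | Holiday, leisure or recreation | 1,522,168 | B | 1,168,713 | B | 165,738 | E | 187,717 | D |
| Overnight | Visit friends or relatives | 483,554 | B | 437,321 | B | 10,167 | E | 36,065 | E |
| Overnight | Personal conference, convention or trade show | 2,952 | E | 1,669 | E | 849 | E | 433 | E |
| Overnight | Shopping, non-routine | 24,399 | E | 24,399 | E | .. | | .. | |
| Overnight | Other personal reasons | 398,009 | D | 175,220 | B | 66,939 | E | 155,850 | E |
| Overnight | Business conference, convention or trade show | 30,435 | E | 8,548 | E | 123 | E | 21,764 | E |
| Overnight | Other business | 271,112 | C | 213,703 | C | 4,276 | E | 53,133 | E |

.. data not available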

Estimates contained in this table have been assigned a letter to indicate their coefficient of variation (c.v.) (expressed as a percentage). The letter grades represent the following coefficients of variation:

A: c.v. between or equal to 0.00% and 5.00% (Excellent)
B: c.v. between or equal to 5.01% and 15.00% (Very good)
C: c.v. between or equal to 15.01% and 25.00% (Good)
D: c.v. between or equal to 25.01% and 35.00% (Acceptable)
E: c.v. greater than 35.00% (Use with caution)
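The grading scale can be expressed as a small lookup. The sketch below assumes the coefficient of variation is supplied as a percentage and rounded to two decimal places before grading, since the bands are stated to two decimals; the function name `cv_grade` is hypothetical.

```python
def cv_grade(cv_percent):
    """Map a coefficient of variation (in percent) to its letter grade."""
    if cv_percent < 0:
        raise ValueError("a coefficient of variation cannot be negative")
    cv = round(cv_percent, 2)  # bands are defined to two decimal places
    if cv <= 5.00:
        return "A"  # Excellent
    if cv <= 15.00:
        return "B"  # Very good
    if cv <= 25.00:
        return "C"  # Good
    if cv <= 35.00:
        return "D"  # Acceptable
    return "E"      # Use with caution


grades = [cv_grade(value) for value in (2.5, 5.01, 25.0, 35.0, 40.0)]
```

Under these assumptions the five sample values fall into grades A, B, C, D and E respectively.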