Audit of Quality Assurance - Consumer Price Index

November 2024
Project number: 80590-139

Executive summary

The Consumer Price Index (CPI) program is a mission critical program necessary to meet Statistics Canada's core legislated mandate. The CPI is often used as a general indicator of inflation in Canada and is widely known, quoted and trusted by Canadians.

Monthly production of the CPI follows an iterative data flow process with validation and quality checks occurring at various points throughout. To meet the preannounced publication dates, the CPI program must follow a tight production schedule. Additionally, because of its extensive use for indexation purposes, the CPI is not subject to revision.

The CPI is measured based on a basket of consumer goods and services. Weights are assigned to product groups in the CPI basket through a basket update process that occurs annually. These basket weights are then applied when calculating the monthly CPI.

The agency has developed a structured approach to achieving high-quality statistical outputs as described in its Quality Assurance Framework and complementary Quality Guidelines. Together, these foundational tools define the elements of quality pertaining to the production of official statistics. While practical guidelines and checklists exist, staff expertise, judgment and unique skill sets remain essential—all activities within the production process must show a concern for quality.

The agency has also developed the Directive for the Validation of Statistical Outputs and the accompanying Guidelines for the Validation of Statistical Outputs, which aim to strengthen quality assurance and support the quality of statistical outputs. These instruments provide guidance and practical advice regarding validation—a final and very important quality check that statistical outputs must pass prior to dissemination. This final check challenges the validity of the statistical information to ensure there are no design flaws or execution errors left undetected.

Lastly, the International Monetary Fund's Consumer Price Index Manual: Concepts and Methods provides a comprehensive overview of the methods and practices that national statistical organizations should consider. The international standards contained within provide guidelines on best practices and promote the quality and international comparability of national CPIs.

Why is this important?

The CPI is relied upon by various stakeholders such as governments, businesses, the Bank of Canada, the System of National Accounts and academia. One of the core principles underpinning the agency's statistical programming is quality—a commitment that helps to foster and sustain the trust of Canadians and agency stakeholders. This audit was conducted due to the importance of quality to the agency's success.

Overall conclusion

Management has implemented an adequate quality control framework to ensure the high quality of CPI outputs and the consistent application of related quality assurance processes. Quality assurance mechanisms are in place throughout the data flow process and embedded in the direction of the program. Dedicated staff present themselves as well seasoned and knowledgeable about their roles, responsibilities and accountabilities, and can articulate those of their colleagues and peers alike. Some opportunities for improvement were noted in the areas of adherence to validation guidelines, tools and guidance, and risk management.

Key findings

Validation of monthly outputs

Various quality assurance activities are being performed by the CPI program to validate its monthly statistical outputs, and the results are being discussed through appropriate governance mechanisms that include senior management. Relevant roles, responsibilities and accountabilities are mostly documented and well understood, and employees are generally supported by processes and procedures that direct the validation of monthly CPI outputs. However, the CPI program has not formally documented a validation strategy for its monthly outputs, and it does not prepare the formal validation reports that agency guidelines require.

Quality assurance within the annual basket update process

Numerous quality assurance and validation activities are being performed by the CPI program during a basket update. These activities are being documented, the results are being reported to program management and suitable governance processes are overseeing the quality assurance of basket updates. Roles, responsibilities and accountabilities for quality assurance within the basket update process are documented and understood by most, and supporting processes and procedures are in place—though some limitations were observed. A comprehensive validation strategy has not been developed for basket updates.

Quality assurance over the receipt and preparation of alternative data sources

The CPI program consistently performs various quality assurance activities when ingesting alternative data sources, and governance processes are in place to oversee the quality assurance of alternative data sources used in the production of the CPI. Roles, responsibilities and accountabilities for quality assurance of alternative data sources are documented and well understood. Relevant processes and procedures are well documented yet evolving, and some key processes require refinement for the program to reach full maturity in machine learning operations.

Risk management

CPI management and staff have a general awareness of program risks and some ad hoc risk management activities are occurring. However, risk management processes are not formally documented for the CPI program.

Conformance with professional standards

The audit was conducted in accordance with the Mandatory Procedures for Internal Auditing in the Government of Canada, which include the Institute of Internal Auditors' Global Internal Audit Standards.

Sufficient and appropriate audit procedures have been conducted, and evidence has been gathered to support the accuracy of the findings and conclusions in this report and to provide an audit level of assurance. The findings and conclusions are based on a comparison of the conditions, as they existed at the time, against pre-established audit criteria. The findings and conclusions are applicable to the entity examined and for the scope and period covered by the audit.

Steven McRoberts
Chief Audit and Evaluation Executive

Introduction

Background

The Consumer Prices Division (CPD) is mandated to produce timely and relevant data on consumer price change over time and across geographical areas in Canada. To accomplish this, the CPD produces and publishes the Consumer Price Index (CPI) monthly. The objectives of the CPI program are to provide reliable and timely measures of consumer price change and high-quality information to data users within set service standards.

The CPI program is a mission critical program necessary to meet Statistics Canada's core legislated mandate. The CPI itself is an indicator of change in prices experienced by Canadian consumers. It is calculated by comparing, through time, the cost of a fixed basket of goods and services. The index is often used as a general indicator of inflation in Canada. It is widely known, quoted and trusted by Canadians. One of its most important uses is by governments, businesses and individuals to adjust selected contractual or legislated payments in line with inflation. Additionally, the CPI informs monetary policy, aids the Bank of Canada as it seeks to maintain inflation within its target range and enables the System of National Accounts to estimate gross domestic product in constant dollars. The CPI is released monthly, via The Daily [Footnote 1], within 31 days of the price observation period. The release is typically in the third week of the month following the price observation period. [Footnote 2]

Monthly production of the CPI follows an iterative data flow process whereby products and geographies are classified, a sampling strategy is determined, outlets are selected, and prices are collected. Once the datasets are gathered, they are reviewed, edited, imputed, and quality adjusted. Afterward, the elementary price indexes are calculated and aggregated, the results are analyzed and the monthly CPI statistics are disseminated. Validation and quality checks occur at various points as the data flows through the production process. To meet the preannounced publication dates, the CPI program must follow a tight production schedule. Additionally, because of its extensive use for indexation purposes, the CPI is not subject to revision.

The CPI is measured based on a basket of consumer goods and services. The CPI basket is classified using eight product groups [Footnote 3] (e.g., food), which are made up of 187 basic aggregates (e.g., gasoline and fresh fruit) and 490 elementary aggregates (e.g., apples). Elementary aggregates are added to or deleted from the basket as consumption patterns change over time. Weights are assigned to product groups in the CPI basket through a basket update process. These basket weights are then applied when calculating the monthly CPI. [Footnote 4]

The agency has developed a structured approach to achieving high-quality statistical outputs, as described in its Quality Assurance Framework (2017) and complementary Quality Guidelines (2019). Together, these foundational tools define the elements of quality pertaining to the production of official statistics and support program managers in ensuring high-quality data production processes. [Footnote 5] While practical guidelines and checklists are provided, they are not intended to replace the expertise and judgment of the staff responsible for producing data. Moreover, all activities of every production process must show a concern for quality. All staff involved in statistical activities are responsible for ensuring that quality has high priority when designing and implementing statistical methods and procedures.

The agency has also developed the Directive for the Validation of Statistical Outputs (2015) and accompanying Guidelines for the Validation of Statistical Outputs (2015), which aim to strengthen quality assurance and support the quality of statistical outputs. These instruments provide guidance and practical advice regarding validation—a final and very important quality check that statistical outputs must pass prior to dissemination. This final check challenges the validity of the statistical information to ensure there are no design flaws or execution errors left undetected.

Lastly, the International Monetary Fund's Consumer Price Index Manual: Concepts and Methods (2020) provides a comprehensive overview of the methods and practices that national statistical organizations should consider. The international standards contained within provide guidelines on best practices and promote the quality and international comparability of national CPIs.

Audit objective

The objective of this audit was to provide reasonable assurance that management has established an adequate quality control framework to ensure the high quality of CPI outputs and the consistent application of related quality assurance processes.

Scope

The scope of the engagement included an examination of select components of the CPI program's quality control framework, including governance and oversight, roles and responsibilities, established quality assurance processes that are supported by tools and guidance, and the application of such processes.

Field work in relation to the above included the following specific areas:

  • activities performed to validate CPI outputs before they are released to the public
  • quality assurance within the annual basket update process
  • quality assurance over the receipt and preparation of alternative data sources used in the production of the CPI.

A sample of quality assurance processes was selected for testing to confirm they were functioning as intended. The focus was on specific activities being performed by the CPI program to ensure the high quality of statistical outputs. This audit did not include the reperformance of any CPI calculations, nor did it assess or comment on the accuracy or validity of actual statistical outputs released by the CPI program.

With respect to audit scope, use of the term "CPI outputs" or "statistical outputs" includes The Daily text and all supporting Common Output Data Repository tables published online for public consumption in accordance with the CPI's official monthly release schedule. Also included is the table of CPI basket weights that is published online and updated annually. The CPI program also publishes monthly average retail prices tables, a data visualization tool, and assorted articles, reports, journals, and periodicals. These additional publications were not included within the scope of this audit.

The period under review was January 1, 2023, to June 30, 2024.

Approach and methodology

Field work consisted of

  • interviews and walkthroughs with key management and staff
  • examination and analysis of relevant documentation
  • testing of a sample of relevant quality assurance processes
  • review of current policies, guidelines and standards
  • review of select committee meeting agendas and records of decision.

Authority

The audit was conducted under the authority of the approved Statistics Canada Integrated Risk-based Audit and Evaluation Plan (2023/2024 to 2027/2028).

Findings, recommendations and management response

Validation of monthly outputs

Various quality assurance activities are being performed by the CPI program to validate its monthly statistical outputs, and the results are discussed through appropriate governance mechanisms that include senior management. Relevant roles, responsibilities and accountabilities are mostly documented and well understood, and employees are generally supported by processes and procedures that direct the validation of monthly CPI outputs. However, the CPI program has not formally documented a validation strategy for its monthly outputs, and it does not prepare the formal validation reports that agency guidelines require.

In a statistical process, validation is the set of activities that ensures that weighted estimates and aggregate statistics are reliable, sound and defensible. It includes the processes used to identify and correct inconsistencies in microdata and macrodata using diagnostic tools and subject matter expertise. Validation activities therefore assess the quality of a statistical output in terms of its accuracy, coherence, and overall reasonableness. Statisticians validate the quality of the outputs produced, in accordance with a general quality framework, relative to expectations based on their cumulative knowledge of the specific statistical domain. Validation is meant to challenge rather than rationalize estimates. Therefore, it is important that unusual results or movements are understood and that atypical estimates are a trigger for more detailed investigation.

Statistics Canada's Directive for the Validation of Statistical Outputs (the Directive) makes the necessary provisions to ensure that validation is carried out and documented for all of the agency's statistical outputs. It requires that all mission critical programs perform the validation steps described in the agency's Guidelines for the Validation of Statistical Outputs (the Guidelines) unless a justification can be given as to why a step could not be completed. As a mission critical program, the CPI is expected to comply with both the Directive and the Guidelines.

The Guidelines state that directors "shall ensure that each program in their Division has a validation strategy in place which meets the requirements of the Directive [for] the Validation of Statistical Outputs." A validation strategy details a statistical program's planned validation activities to be conducted and the specifics on what will be done (e.g., which files to confront or which external partners will be implicated). Each program's validation strategy should be documented within a validation report, though no specific format is required for the report. In most instances, the validation strategy component of the validation report will remain stable over time and will need to be updated only as changes are made to the validation measures used. Program directors are expected to review each program's validation strategy at least every three years to ensure the validation activities being conducted are sufficient to mitigate risks to data quality.

The Guidelines further require the validation report to be updated each production cycle to record the results of the program's latest validation activities. If perceived or actual anomalies were found, these discrepancies, the investigation that was conducted, and either the correction strategy implemented or the reason for accepting the discrepancy must be documented. The report need only capture the main points or significant elements of the validation process, but it must offer enough detail to make clear why the data quality conclusion was reached.
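The documentation requirement above (discrepancy, investigation, and either a correction or a reason for acceptance) can be pictured as a simple record. The following is only a minimal sketch: the class name, field names and example text are hypothetical and are not taken from the Guidelines, which prescribe no specific report format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnomalyRecord:
    """One discrepancy noted during a production cycle (hypothetical structure)."""
    description: str
    investigation: str
    correction: Optional[str] = None
    acceptance_reason: Optional[str] = None

    def is_complete(self) -> bool:
        # A record needs a description, an investigation, and either a
        # correction strategy or a reason for accepting the discrepancy.
        has_outcome = bool(self.correction) or bool(self.acceptance_reason)
        return bool(self.description) and bool(self.investigation) and has_outcome


# Invented example entries, for illustration only.
record = AnomalyRecord(
    description="Unusually large monthly movement in one aggregate",
    investigation="Compared against source price quotes",
    acceptance_reason="Movement confirmed by source data",
)
incomplete = AnomalyRecord(description="Unexplained drop", investigation="")

print(record.is_complete(), incomplete.is_complete())
```

A check of this kind would only confirm that each entry is filled in; whether the recorded explanation is adequate remains a matter of subject matter judgment.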

The CPI program regularly performs many of the quality assurance activities for validating monthly CPI outputs as required by the Guidelines. Governance processes are in place to oversee the validation of statistical outputs.

The Guidelines divide validation activities into eight subject matter-based activities and two process-based activities. The subject matter-based validation activities are (1) analysis of change over time, (2) verification of seasonally adjusted estimates, (3) verification of estimates through cross-tabulation, (4) coherence analysis based on known current events, (5) confrontation with other similar sources of data published by Statistics Canada, (6) consultation with stakeholders internal to Statistics Canada, (7) participation in the Daily Forum and (8) formal briefing to the Executive Governance. The process-based validation activities are (1) review of production processes and (2) coherence analysis based on quality indicators (see Appendix B for a description of validation activities).

The audit team learned that responsibilities for each of the eight product groups within the CPI basket are assigned to five units [Footnote 6] within the CPD's Production Subdivision. Further, the completion of some validation activities is a joint responsibility among units within the Production Subdivision and the Analysis and Dissemination Units, while others are the sole responsibility of one unit within the CPD. Consequently, five of eight product groups and relevant units within the Production Subdivision were sampled alongside the Analysis and Dissemination Units to assess the application of key validation steps within the Guidelines for three randomly sampled reference months. All units selected were assessed based on preexisting responsibilities and accountabilities for ten key validation steps (listed above). Overall, of the eight required subject matter-based validation steps, five were determined to be fully compliant and three were compliant with condition. As for the two process-based steps, both were deemed non-compliant (details below).

The audit found that, each month, all units complete monthly analysis reports that document the price movements, patterns, trends and drivers for each assigned product group. Monthly index review meetings are then held between the production units and the Analysis Unit to present price movements, such as year-over-year and month-over-month analyses and key drivers of changes in price, for discussion and peer review. The monthly analysis reports generated by the production units act to disseminate information and validate each individual product group at its respective aggregate level. The Analysis Unit may request additional information and subsequently perform activities to validate the all-items CPI [Footnote 7] at the aggregate macro level. These monthly index review meetings include key CPI management and staff and act as a verbal sign-off for program managers to approve data at the aggregate index level. Additionally, the audit found that the CPI program is presenting its monthly pre-release results and obtaining verbal sign-off from the program and senior management, including the Strategic Management Committee, in alignment with key validation requirements (see Appendix B). The Strategic Management Committee is a Tier 1 committee that provides broad strategic direction for the agency and acts as the body for all decision-making related to the corporate-level management and governance of the agency.
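To make the month-over-month and year-over-year analyses mentioned above concrete, the following is a minimal sketch of how the two movements are computed from a series of monthly index levels. The function names and all index values are hypothetical; this is not the CPI program's actual tooling.

```python
def percent_change(current, previous):
    """Percentage change from previous to current."""
    return 100.0 * (current - previous) / previous


def movement_summary(series, month):
    """Month-over-month and year-over-year changes for monthly index levels.

    `series` is ordered oldest to newest; `month` is a zero-based position
    that must be at least 12 so that a year-earlier value exists.
    """
    if month < 12:
        raise ValueError("need at least 12 earlier months for a year-over-year change")
    return {
        "month_over_month": percent_change(series[month], series[month - 1]),
        "year_over_year": percent_change(series[month], series[month - 12]),
    }


# Hypothetical index levels for 13 consecutive months (invented values).
levels = [100.0, 100.2, 100.5, 100.4, 100.9, 101.0, 101.3,
          101.5, 101.4, 101.8, 102.0, 102.5, 103.0]
summary = movement_summary(levels, 12)
print(summary)
```

With these invented levels, the latest month is up 3.0% from a year earlier and roughly 0.49% from the previous month; in a review meeting, figures like these would be the starting point for discussing the drivers behind the movement.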

Review of production processes

While ad hoc activities are being undertaken to address and resolve issues and challenges that arise, a proactive review of production processes by the production units sampled was not observed. That said, the designated unit head was able to articulate several instances where issues and challenges arose, and explain the steps and actions taken to address and resolve such issues. Therefore, actions are being taken to address production process issues, but are not actively monitored as part of the validation process.

Coherence analysis based on quality indicators

Interviews revealed that there are Government of Canada service standards for the CPI program, such as the requirements that client acknowledgments be sent within 24 hours and that the monthly CPI release be published within 31 days of the price observation period. Moreover, the CPD tracks and reports on web metrics such as the number of hits on specific websites; however, quality indicators for the CPI program do not exist beyond this capacity. CPI management confirmed that the coherence analysis based on quality indicators called for by the Guidelines is not being performed for the program.

Roles, responsibilities and accountabilities for CPI monthly validation activities are mostly documented and appear to be communicated and well understood by relevant employees. Some processes and procedures are in place to direct the validation of CPI monthly statistical outputs.

Performance agreement work objectives for key CPI management and staff include some significant validation activities such as data verification, the verification and analysis of outliers, and the identification of processing issues. Likewise, job postings and statements of merit criteria recognize the need for some key validation activities.

Commodity units are responsible for generating and maintaining their own workload trackers. Each one is unique, without a common look and feel; however, roles, responsibilities and accountabilities can be found to varying degrees, some limited, within each tracker. Commodity units distribute workload based on commodity class, and all activities for each commodity class are in most cases performed by the same analyst.

Discussions with key CPI staff regarding the segregation of duties found that separating conformity and acceptance activities from statistical review and validation activities could add a layer of control; however, such segregation may also add unnecessary complexity for staff. Currently, these activities are not segregated for most commodity units, as it is seen to be more beneficial for each analyst to build subject matter expertise within their respective commodity groups. This enables analysts to speak fluently to price movements. Moreover, as the Analysis Unit performs an impartial review of the all-items CPI, there is an additional layer of oversight currently built into the validation process to catch any potential outliers in the data.

Interviews with key CPI management and staff revealed that there is a certain amount of training and expertise that can be gained via on-the-job shadowing, which is occurring when new employees are on-boarded. Additionally, most employees felt and were clearly able to demonstrate they understood their roles, responsibilities and accountabilities, as well as those of their colleagues, and that these roles, responsibilities and accountabilities were communicated to them.

When asked for their tools and guidance, most of the five production units sampled and the Analysis and Dissemination Units provided ample documentation to support the discharge of their responsibilities as they relate to validation activities. All production units have templates to generate monthly index analysis reports. The Analysis Unit has a guide that provides direction on how to produce the monthly CPI release in The Daily and prepare and present analytical material to division and agency management. Moreover, the Dissemination Unit uses a CPI monthly task list to track activities performed to release monthly CPI outputs and a CPI data validation guide, which contains steps to download, format and validate applicable CPI Common Output Data Repository tables against system reports.

While documentation exists, some opportunities for improvement were noted as it relates to the breadth and depth of detail, coverage of all validation activities as per the Guidelines, target audience, and continued relevance or applicability. For example, some guidance on performing data validation at the micro level can be found in the CPI Production Manual, though the guide omits validation activities at the aggregate level and was last updated in 2014, which limits its relevance.

Ample and extensive training documentation exists providing CPI employees with general information and guidance on data flows, objectives, program structure, etc. However, the documentation could provide specific guidance on CPI monthly validation activities, notably at the aggregate level.

Although gaps exist (e.g., a lack of onboarding documentation and step-by-step procedural guidance for some units), the consensus among interviewees was that documentation generally exists for the CPI program. Yet, documentation is fragmented, and sifting through it to find what is needed proves challenging at times. Moreover, some of the skills and expertise needed to treat and analyze data at the micro level, generate observations and build an understanding of price movements for a specific commodity group at both the micro and macro levels were reported as requiring a certain degree of on-the-job training. Although it is understandable that such training is not formally documented, some interviewees expressed that parts of it could be.

A formal validation strategy for monthly CPI outputs has not been developed, and formal validation reports are not being prepared for each production cycle.

The audit found that a formal validation strategy does not exist and that monthly validation reports are not being generated for the CPI program. The purpose of the validation strategy and validation reports is to ensure all required validation activities have been integrated into the program, are functioning as intended and are being monitored and evaluated. Validation reports specifically act to record and communicate the outcomes of investigations. Moreover, they can speak to and track potential patterns and trends of the monthly CPI data. Quality indicators for the program should be reported and align with the validation reports and the validation strategy.

Quality assurance within the annual basket update process

Numerous quality assurance and validation activities are being performed by the CPI program during a basket update. These activities are documented, the results are reported to program management and suitable governance processes are overseeing the quality assurance of basket updates. Roles, responsibilities and accountabilities for quality assurance within the basket update process are documented and understood by most, and supporting processes and procedures are in place, though some limitations were observed.

The CPI is a weighted average of the price changes of a fixed basket of goods and services, based on the expenditures of a target population in a given reference period. In order to be representative of the price change experienced by Canadians, the basket weights must be representative of how Canadians are spending their money. In 2021, the basket update process was changed from a multi-year update to an annual update to make the CPI basket more representative of Canadian spending habits. While more frequent basket updates should theoretically reduce substitution and new goods biases, this change also comes with increased levels of effort and introduces more opportunities for errors.
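As a rough illustration of the weighted-average idea described above, the sketch below aggregates price relatives using basket weights. The function name, the three product groups and all weights and prices are hypothetical and chosen only for illustration; this is a simplified fixed-basket calculation, not the CPI program's production methodology.

```python
def fixed_basket_index(weights, base_prices, current_prices):
    """Weighted average of price relatives, scaled so the base period equals 100."""
    total_weight = sum(weights.values())
    weighted_sum = sum(
        weights[item] * (current_prices[item] / base_prices[item])
        for item in weights
    )
    return 100.0 * weighted_sum / total_weight


# Hypothetical basket: expenditure shares and prices are invented for illustration.
weights = {"food": 0.5, "shelter": 0.3, "transportation": 0.2}
base_prices = {"food": 100.0, "shelter": 200.0, "transportation": 50.0}
current_prices = {"food": 110.0, "shelter": 200.0, "transportation": 55.0}

index = fixed_basket_index(weights, base_prices, current_prices)
print(round(index, 2))
```

In this invented example, food and transportation prices each rise 10% while shelter is unchanged, so the index rises by 7%. The result depends directly on the weights, which is why unrepresentative or miscalculated weights carry through to every monthly index computed from them.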

The audit team learned that the annual basket update process is complex and involves many quality assurance and validation activities requiring specialized knowledge and skill sets, which are performed by various teams and specific key individuals across the CPD. There are essentially two high-level groups of quality assurance and validation activities within the annual process: (1) quality assurance and validation of the new basket weight values, and (2) quality assurance and validation of the integration of the new basket weights within CPI production and dissemination systems. The numerous quality assurance and validation activities performed fall within these two groups and are mostly documented within a basket update schedule maintained by the CPD.

The CPI program performs numerous quality assurance and validation activities during a basket update, many of which align with the Guidelines. Governance processes are in place to oversee the quality assurance of basket updates.

The audit team selected a sample of 15 unique quality assurance and validation activities from the most recent basket update schedule. Supporting documentation demonstrating the performance of each step was requested and provided. The audit found that all 15 steps selected were completed by the appropriate teams and individuals to the best of their understanding.

The audit also found that the CPI program is compliant with expectations for documenting statistical metadata. There is a sufficient level of information available to Canadians about basket weights and the basket update process, including the primary and alternative data sources used. In other words, the supplementary information necessary to appropriately understand, analyze and use the statistical information is sufficiently available.

Further, there are multiple levels of oversight over the annual basket update process. The Basket Update Improvements Working Group, the Basket Update Management Committee and the CPI Steering Committee [Footnote 8] all play a role in overseeing the yearly update to the basket. The objective of the working group is to identify areas of improvement in CPI basket update methods and processes, help plan work to implement improvements and recommend improvements to program management. Improvement activities are being undertaken and tracked. The management committee is attended by program management up to the director level. This committee reviews and approves changes to basket update methods and processes. Certain items are presented to the CPI Steering Committee, as required. A review of agendas and meeting minutes demonstrates that structured discussions are taking place at various levels within the CPD to adequately oversee the quality assurance of basket updates.

Roles, responsibilities and accountabilities for key personnel relevant to quality assurance within the basket update process are mostly documented and understood. Processes and procedures are mostly in place to direct quality assurance within the basket update process.

The audit found that the primary guiding document for assigning roles and responsibilities is the basket update schedule maintained by the CPD. In addition, relevant responsibilities are included in work objectives for most of the teams and individuals involved. In most cases, individual roles, responsibilities and accountabilities were reported during interviews to be well understood, though one key analyst role was found to be undefined. Interviews also revealed some confusion around roles and responsibilities in relation to validating the additivity of special aggregates within the basket weights table. The Analysis and Dissemination Units indicated that it is not clear who is responsible for performing this validation check. Consequently, this year, several miscalculations were overlooked during earlier checks and were only discovered closer to the release of the new basket weights than desired.
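An additivity check of the kind referred to above can be sketched as follows: each aggregate's published weight is compared with the sum of its component weights, and any difference beyond a tolerance is flagged. The function name, item names, weights and tolerance are all hypothetical; the report does not describe how the CPI program actually performs this check.

```python
import math


def check_additivity(weights, aggregates, tolerance=0.01):
    """Return aggregates whose weight differs from the sum of their components.

    `weights` maps each item to its weight; `aggregates` maps an aggregate
    name to the list of component items whose weights should sum to it.
    """
    failures = {}
    for name, components in aggregates.items():
        component_sum = math.fsum(weights[c] for c in components)
        if abs(weights[name] - component_sum) > tolerance:
            failures[name] = (weights[name], component_sum)
    return failures


# Hypothetical weights (percent of basket); names and values are invented.
weights = {
    "fresh fruit": 1.20,
    "fresh vegetables": 1.30,
    "fresh fruit and vegetables": 2.50,
    "gasoline": 3.50,
    "other energy": 4.00,
    "energy": 7.80,  # deliberately inconsistent: components sum to 7.50
}
aggregates = {
    "fresh fruit and vegetables": ["fresh fruit", "fresh vegetables"],
    "energy": ["gasoline", "other energy"],
}
failures = check_additivity(weights, aggregates)
print(failures)
```

Here only the invented "energy" aggregate is flagged. Running such a check early, with a clearly assigned owner, is one way miscalculations could surface well before the release of new basket weights.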

Some procedural documentation was provided by most of the teams involved in the annual basket update process. In general, the audit found that the documentation provides sufficient guidance to team members in relation to their individual quality assurance and validation activities. However, there was acknowledgement that documentation for the Index Modelling Unit could be improved (e.g., there is no preexisting documentation describing—for each quality assurance activity performed—what is expected, what constitutes pass or fail, or how to document and communicate results). Similarly, the Analysis and Dissemination Units and one key analyst acknowledged the same gap in terms of detailed documentation.

A comprehensive validation strategy has not been developed for basket updates. A list of various quality assurance and validation activities and their results are documented and reported to program management, though there are some limitations.

As with the monthly CPI (as discussed above), the audit team understood that the annual basket update would also be subject to validation requirements, because the CPI basket weights table is a unique statistical output and is published independently, via The Daily, from the standard monthly CPI outputs. Therefore, the standard CPI monthly validation process would not cover the basket update itself.

Accordingly, the audit team requested a documented validation strategy for the annual basket update process. It was informed that the validation work completed is documented through the presentations that are provided to program management to facilitate their review and approval of the new basket weights via a series of certification meetings. The audit team reviewed the presentation decks provided and observed that they include a cumulative list of meeting focus items. The focus items included a review of the algorithm to be used as a quality assurance tool, Production Subdivision input regarding important outliers, a review of outstanding outliers, a summary of the weight review process and a review of relative shares. The audit team determined that this list could be considered a limited validation strategy. Even so, the final list cannot be mapped to all the key validation steps as per the Guidelines. Likewise, the basket update schedule is a reasonably detailed tool that explicitly assigns responsibilities for some key quality assurance and validation activities to various teams and individuals involved in the annual basket update process; however, it too cannot be mapped to the required validation steps for mission critical programs.

Regarding a validation report to record and communicate the results of the annual quality assurance and validation activities, the audit team was informed that complete documentation exists on all comments received, adjusted or not adjusted, the number of changes made, the net impact on each major aggregate, techniques used to make adjustments, and the actual amount of each adjustment made. The audit team examined the documentation provided and determined that it conceptually includes the components noted. Also, the results within the documentation provided are discussed during the certification meetings with program management (noted above), though records of the discussions are not kept. As with the validation strategy, however, the documentation provided is considered limited in terms of a validation report, because it cannot be easily or entirely mapped to all the validation steps required per the Guidelines.

Given the complexity and broad scope of participants involved in quality assurance and validation activities within the annual basket update process, a formal, comprehensive validation strategy would better support program management in clarifying roles, responsibilities and accountabilities. Further, without adequate procedural documentation supporting key roles and responsibilities, changes in personnel or oversight could cause certain quality assurance and validation activities to not be performed adequately or at all, leading to potential errors in the basket weights that are publicly released.

Recommendation

It is recommended that the assistant chief statistician, Economic Statistics, ensure that

  1. a comprehensive validation strategy is formally documented for the CPI program encompassing (1) monthly CPI outputs, including the tracking of program quality indicators, and (2) annual basket updates and formal validation reports prepared in alignment with requirements prescribed for mission critical programs as per the Guidelines for the Validation of Statistical Outputs.

Management response

Management agrees with the recommendation.

The CPI program will formally document its validation strategy, including the annual basket update, and align it with the corporate guidelines on the validation of statistical outputs for mission critical programs.

Deliverables and timelines

The director general, Industry Statistics, will

  1. formalize monthly CPI and annual basket update validation strategy document in alignment with the Guidelines for the Validation of Statistical Outputs by April 2025
  2. formalize cyclical validation reports in alignment with the Guidelines for the Validation of Statistical Outputs by September 2025.

The director general, Industry Statistics, in collaboration with the director general, Modern Statistical Methods and Data Science, will

  1. establish preliminary quality indicators for price indexes and apply measures to the CPI program by April 2025.

Recommendation

It is recommended that the assistant chief statistician, Economic Statistics, ensure that

  1. roles, responsibilities and accountabilities pertaining to validation activities—in relation to (1) monthly CPI outputs and (2) annual basket updates—are reviewed and updated, as needed, and detailed processes and procedures are established as required.

Management response

Management agrees with the recommendation.

As part of the development of a comprehensive validation strategy, the CPI program will improve processes and procedures and clarify roles and responsibilities pertaining to these validation activities. This will include activities which have roles and accountabilities across multidisciplinary teams at various stages of the CPI production cycle.

Deliverables and timelines

The director general, Industry Statistics, will

  1. formally document assigned responsibilities, accountabilities and approval mechanisms for monthly CPI and annual basket update validation activities by March 2025
  2. update processes and procedures relating to monthly CPI and annual basket update validation activities by May 2025.

Quality assurance over the receipt and preparation of alternative data sources

The CPI program consistently performs various quality assurance activities when ingesting alternative data sources, and governance processes are in place to oversee the quality assurance of alternative data sources used in the production of the CPI. Roles, responsibilities and accountabilities for quality assurance of alternative data sources are documented and well understood. Relevant processes and procedures are well documented yet evolving, and some key processes require refinement for the program to reach full maturity in machine learning operations.

Statistics Canada continually strives to modernize and streamline data collection methods. Within the CPI program, alternative data sources, such as retail scanner data and web-scraped data, have been gradually replacing traditional field collection (i.e., interviewers visiting retail stores in person to collect price observations). Presently, about half of all price collection data—representing approximately 20% of the overall CPI basket—are collected through some form of alternative data source, with a goal of increasing the share of price collection data obtained this way to 75%. The agency continues to work with potential new data providers to expand the use of alternative data sources.

This progressive increase in the use of alternative data sources has altered CPI production processes and methodologies, as well as data ingestion systems, which require appropriate capacity and resources to receive and prepare large datasets. Like many national statistical organizations, Statistics Canada's CPI program leverages supervised machine learning algorithms to effectively process these datasets.

The audit team observed that CPI program data scientists are key contributors to the United Nations' Committee of Experts on Big Data and Data Science for Official Statistics. Their task team on scanner data has developed a handbook on using new data sources in the production of consumer price statistics. Canada is also a member of the committee's advisory board and chair of its technical delivery board. CPI data scientists have published several articles and studies that have contributed to global advancement for CPI programs within the fields of alternative data sources and machine learning. Nevertheless, these are emerging disciplines for national statistical organizations and, consequently, international standards are not yet defined. The audit team was informed that most national statistical organizations are still researching and collaborating on methodologies in these areas.

The CPI program consistently performs various quality assurance activities when ingesting alternative data sources. Governance processes are in place to oversee the quality assurance of alternative data sources used in CPI production.

According to publications and research emerging from Statistics Canada and other national statistical organizations, several key processes can contribute to the quality of alternative data sources used in the CPI. These processes include obtaining data via negotiation with providers, global quality checks, detailed quality checks, machine learning models with manual validation, and subject matter quality assurance.

A judgmental sample of three forms of alternative data sources used by the CPI program was selected for detailed examination based on risk, prominence of use within specific product groups and the materiality of their basket weights, such that sampling covered approximately 50% of the CPI basket (for reference year 2022). The three forms of alternative data sources selected were (1) retail scanner data used for the food product group, (2) web-scraped data used for the clothing and footwear product group, and (3) other alternative data used for mortgage interest costs within the shelter product group. Three randomly sampled reference months were also selected (see the Validation of monthly outputs section). Based on the audit sample tested, the audit found that all quality assurance procedures for ingestion (e.g., global and detailed quality checks and manual validation of machine learning classifications) were consistently applied in accordance with documented procedures.

The audit team additionally observed that formal written agreements are in place with the major retailers who provide scanner data used in the production of the CPI. The International Monetary Fund's Consumer Price Index Manual: Concepts and Methods outlines this as a good practice for ensuring that quality and timeliness needs are met. Statistics Canada also has a contract in place with a third party to perform web scraping. The contract includes a detailed statement of work to ensure that the data received meet the agency's needs for quality and timeliness and that the agency provides retailer-specific instructions to the contractor to ensure alignment with the CPI classification structure. Employees involved in quality assurance processes for alternative data sources had no consequential concerns with respect to data timeliness.

The audit found that the CPI Steering Committee is actively overseeing the CPI program's expanding use of alternative data sources. The committee receives regular progress updates on the incorporation of new alternative data sources and the ingestion and processing of existing alternative data sources. Presentations to the committee include related issues, risks and resolutions. Information on quality assurance processes for retail scanner and web-scraped data, including machine learning operations, as well as ongoing work and accomplishments in these areas, is also included.

Roles, responsibilities and accountabilities of key personnel involved in the quality assurance of alternative data sources are documented and well understood. Processes and procedures to direct the quality assurance of alternative data sources during ingestion are well documented yet evolving.

The audit found that roles, responsibilities and accountabilities related to quality assurance activities required for ingesting alternative data sources, including machine learning operations, were documented in a variety of individual procedural guidance documents. Interviews with key program management and staff also indicated that roles and responsibilities are well understood—there was no uncertainty or overlap expressed or noted.

Numerous procedural documents are used in the CPI program for processing retail scanner data, web-scraped data and other alternative data sources used in production. All key sub-processes were found to have at least one written procedural document. Though these documents were mostly developed at the team or unit level, they were deemed sufficiently detailed to assist relevant employees in the discharge of their responsibilities. Although procedural documentation delineating alternative courses of action in the event of alternative data limitations was not available, relevant employees were able to articulate solutions and provide examples of instances where limitations occurred and how they were effectively resolved.

Statistics Canada has additional guidance documentation in place for the use of alternative data sources such as the Directive on Web Scraping and a comprehensive public-facing document called Shelter in the Canadian CPI: An Overview (2023). The latter outlines the concepts and practices related to the shelter product group, including the methodology and alternative data sources used for mortgage interest costs. The agency does not yet have an overarching position paper or directive on scanner data, however. The Consumer Price Index Manual: Concepts and Methods recommends national statistical organizations publish a position paper articulating how they will proceed with the use of scanner data to compile the CPI, including data sources and methods to be employed.

Processes for machine learning operations have not yet reached full maturity.

For retail scanner data and web-scraped data, the CPI program is leveraging supervised machine learning to help classify and match new unique products within large datasets to the appropriate product groups within the CPI basket structure. According to the United Nations Economic Commission for Europe's Machine Learning for Official Statistics (2021), the key quality assurance processes for machine learning operations consist of (1) model training, (2) model performance monitoring and (3) model retraining.

The audit team assessed each key quality assurance process through interviews and examination of documentation. Overall, the audit found that some procedural documentation was available to guide model training, performance monitoring and retraining activities. However, interviewees acknowledged that no official agency directive exists and international standards are still under development, as machine learning operations are still an emerging area for all national statistical organizations. Key employees noted that the model training process was straightforward—model training data are prepared when new alternative data sources are acquired, prior to the data being used in the production of the CPI. By contrast, interviews confirmed that model performance monitoring and model retraining are presently more ad hoc—the frequencies for these activities have not been officially defined, though a study published by CPI data scientists identified the ideal frequency for retraining as every three to six months.
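To make the idea of a defined retraining frequency concrete, the following is a minimal sketch of a retraining trigger. It is an assumption for illustration, not a description of the CPI program's practice. The six-month maximum age reflects the upper end of the three-to-six-month range cited above. The accuracy floor, function names and dates are hypothetical.

```python
from datetime import date


def months_between(start, end):
    """Whole calendar months from start to end (day of month is ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def needs_retraining(last_trained, today, accuracy,
                     max_age_months=6, min_accuracy=0.95):
    """Return the list of reasons a model is due for retraining.

    max_age_months and min_accuracy are hypothetical parameters that a
    program would set in its own standards.
    """
    reasons = []
    if months_between(last_trained, today) >= max_age_months:
        reasons.append("age")
    if accuracy < min_accuracy:
        reasons.append("accuracy")
    return reasons


# A model trained in January and checked in August is flagged on age alone.
print(needs_retraining(date(2024, 1, 15), date(2024, 8, 1), accuracy=0.97))
```

Combining a time-based rule with a performance-based rule means a model is reviewed on schedule even when monitoring results look stable.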

The audit team observed that CPI data scientists conducted a robust stratified random sample of scanner data for one of the food retailers in 2023. The resulting paper recommended further work be undertaken, including investigating poorly performing product classes, implementing a stratified random sample of other food retailers and measuring model performance on an ongoing basis. To date, no additional random sampling campaigns have been conducted.
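The sampling approach described above can be sketched as follows. This is a simplified illustration, not the method used in the 2023 study. It assumes each classified product is a record with a model-assigned class (`predicted`) and, once reviewed, a manually assigned class (`manual`). The field names, sample sizes and data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, per_class, seed=0):
    """Draw up to per_class records at random from each predicted class."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_class = defaultdict(list)
    for rec in records:
        by_class[rec["predicted"]].append(rec)
    sample = []
    for cls in sorted(by_class):
        items = by_class[cls]
        sample.extend(rng.sample(items, min(per_class, len(items))))
    return sample


def accuracy_by_class(reviewed):
    """Share of predictions in each predicted class confirmed by manual review."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in reviewed:
        totals[rec["predicted"]] += 1
        if rec["predicted"] == rec["manual"]:
            correct[rec["predicted"]] += 1
    return {cls: correct[cls] / totals[cls] for cls in totals}


# Synthetic data: 20 "bread" and 20 "milk" predictions, with 4 of the
# "milk" predictions relabelled as "cream" by the manual reviewer.
records = []
for i in range(40):
    predicted = "bread" if i % 2 == 0 else "milk"
    manual = "cream" if i % 10 == 1 else predicted
    records.append({"id": i, "predicted": predicted, "manual": manual})

sample = stratified_sample(records, per_class=5)
print(accuracy_by_class(sample))   # estimate from the reviewed sample
print(accuracy_by_class(records))  # value over all synthetic records
```

Stratifying by predicted class ensures that small product classes are reviewed, which helps surface poorly performing classes that a simple random sample could miss.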

Further, manual processes are in place to validate machine learning classification outputs. However, while these processes were deemed effective, they are labour intensive and may limit scalability for additional alternative data sources and applications within the CPI program.
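One common way to reduce the manual validation workload is to review only the classifications for which the model reports low confidence. The following minimal sketch illustrates that idea under stated assumptions. The audit does not indicate that the CPI program works this way. The 0.90 threshold, the function name and the sample predictions are hypothetical.

```python
def route_predictions(predictions, threshold=0.90):
    """Split (product_id, label, confidence) tuples into automatically
    accepted classifications and those sent for manual review."""
    auto_accepted, needs_review = [], []
    for product_id, label, confidence in predictions:
        target = auto_accepted if confidence >= threshold else needs_review
        target.append((product_id, label))
    return auto_accepted, needs_review


# Hypothetical model outputs.
predictions = [
    ("p1", "fresh milk", 0.99),
    ("p2", "bread", 0.62),
    ("p3", "women's footwear", 0.90),
]
auto_accepted, needs_review = route_predictions(predictions)
print(len(auto_accepted), "accepted;", len(needs_review), "for manual review")
```

The threshold trades review effort against the risk of accepting a wrong classification, so it would need to be set and revisited using measured model performance.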

Machine learning is a key pillar of the alternative data sources used in the production of the CPI and of continued expansion in this area. Yet, machine learning models can quickly decay once deployed, emphasizing the importance of having consistent quality assurance processes in place to monitor and retrain the models used to help classify new unique products. The agency's long-term goal should be to achieve full maturity for its machine learning operations within the CPI program to ensure the quality, reliability, consistency and scalability of automated classification models and supporting processes. This would include the implementation of accepted leading practices and effective controls over model monitoring.

Recommendation

It is recommended that the assistant chief statistician, Economic Statistics, ensure that

  1. a comprehensive framework with detailed guidance and standards for the use and application of alternative data sources within the CPI program, including machine learning operations, is developed and implemented.

Management response

Management agrees with the recommendation.

The CPI program will elaborate detailed guidance and standards for the use and application of alternative data sources. This will include how the CPI adheres to international guidelines, the desired end state, the operational flow, the underlying corporate architecture, best practices, and roles and responsibilities surrounding alternative data ingestion and use in CPI production.

Deliverables and timelines

The director general, Industry Statistics, will

  1. develop a documented framework with detailed guidance and standards on the use and application of alternative data sources within the CPI program by February 2025
  2. identify frequency of deployment of updated machine learning model and implement by April 2025
  3. formalize the development of machine learning model performance metrics and determine manual review thresholds by July 2025.

Risk management

CPI management and staff have a general awareness of program risks and some ad hoc risk management activities are occurring. However, risk management processes are not formally documented for the CPI program.

Risk management processes are not formally documented.

Statistics Canada's Integrated Risk Management Policy and Integrated Risk Management Framework apply to all corporate, investment, policy, program, project, operational, resource and financial management activities. These instruments are in place to support the integration of risk management into all decision-making processes at all agency levels. They also empower a culture of responsible risk-informed decision making that contributes to the achievement of agency objectives and the improvement of outcomes. Effective risk management can equip a statistical program to respond actively to change and uncertainty by using risk-based information to enable more effective decision making.

Within the agency's risk management policy suite, executives and managers are accountable for providing leadership and direction to their staff in identifying, assessing and managing relevant risks in their plans, programs, projects and operational activities. They also ensure that risks are communicated and reviewed appropriately within the respective governance structure and conveyed to the appropriate internal or external partners and stakeholders. They encourage good risk management practices and are responsible for reporting on risk responses.

Given the mission critical nature of the CPI, key risks to the program's objectives—including emerging risks (e.g., changing methodologies and systems)—should be actively identified and assessed, and the results of this work should be formally documented. This would include assessing each risk for potential impact and probability of occurrence, and documentation of an action plan to mitigate, transfer, avoid or accept the risk.
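The assessment described above can be sketched as a simple risk register. This is an illustration under stated assumptions, not a template from the agency's Integrated Risk Management Framework. The 1-to-5 scales, the impact-times-probability score, the class and function names, and the example entries and ratings are all hypothetical.

```python
from dataclasses import dataclass

# The four responses named in the text above.
RESPONSES = ("mitigate", "transfer", "avoid", "accept")


@dataclass
class Risk:
    description: str
    impact: int        # hypothetical scale: 1 (low) to 5 (high)
    probability: int   # hypothetical scale: 1 (rare) to 5 (almost certain)
    response: str
    action_plan: str = ""

    def __post_init__(self):
        if self.response not in RESPONSES:
            raise ValueError(f"unknown response: {self.response!r}")
        for value in (self.impact, self.probability):
            if not 1 <= value <= 5:
                raise ValueError("impact and probability must be between 1 and 5")

    @property
    def score(self):
        """Simple rating: impact multiplied by probability."""
        return self.impact * self.probability


def prioritize(risks):
    """Order risks from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


# Illustrative entries and ratings only.
register = [
    Risk("Departure of personnel in key quality assurance roles", 4, 2,
         "mitigate", "Regular knowledge transfer activities"),
    Risk("Changing methodologies and systems", 4, 3,
         "mitigate", "Document and test changes before production use"),
    Risk("Minor delay in an internal progress update", 1, 2, "accept"),
]
ranked = prioritize(register)
for risk in ranked:
    print(risk.score, risk.response, risk.description)
```

Recording the response and action plan alongside the rating keeps the assessment and the planned action in one place, which supports periodic review.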

Interviews with key CPI management and staff revealed that they have a general awareness of risks to the CPI program; however, no formal risk management documentation was available. The audit team also observed that the CPD's vision and work plan are not directly linked to a formal risk assessment. Nonetheless, some ad hoc risk management activities are occurring and informal risk mitigation strategies were described during interviews—e.g., regular knowledge transfer activities are taking place. Additionally, the Basket Update Improvements Working Group has documented issues and mitigation plans relevant to the basket update, though this work is not linked to a formal risk framework for the basket update process itself or for the CPI program as a whole.

A risk management framework would ensure that key risks to the CPI program's objectives are being actively identified, monitored, and mitigated or actioned, as deemed necessary. This would in turn help ensure the program's continued delivery of reliable and timely statistical outputs. Further, considering the substantive and ongoing change in the program, formalized risk management practices would better support CPI management in the prioritization of future change initiatives.

Recommendation

It is recommended that the assistant chief statistician, Economic Statistics, ensure that

  1. a formal risk management framework is established for the CPI program, and roles, responsibilities and accountabilities for risk management are formally documented and well understood.

Management response

Management agrees with the recommendation.

The CPI program will develop a formal risk profile which adheres to Statistics Canada's Integrated Risk Management Policy and Integrated Risk Management Framework. The CPI program will use this profile as an input to decision making and planning, prioritizations, investments, and ongoing maintenance. This risk profile will be reviewed on an ongoing basis and remain an evergreen document.

Deliverable and timeline

The director general, Industry Statistics, will

  1. formalize a risk profile in alignment with corporate guidelines by January 2025.

Appendices

Appendix A: Audit criteria

Audit criteria and related policy instruments and sources

Audit criteria

1.1 Governance and oversight processes are in place and operating as intended to oversee select quality assurance processes within the Consumer Price Index (CPI) program.

1.2 Roles, responsibilities and accountabilities for select quality assurance processes within the CPI program are well documented and understood by relevant employees.

1.3 Relevant employees are provided with the necessary tools and guidance to support the discharge of their responsibilities with respect to quality assurance.

1.4 Select quality assurance processes are consistently applied to ensure the high quality of CPI outputs.

Policy instruments and sources

  • Statistics Canada's Quality Assurance Framework
  • Statistics Canada's Quality Guidelines
  • Statistics Canada's Directive for the Validation of Statistical Outputs
  • Statistics Canada's Guidelines for the Validation of Statistical Outputs
  • Statistics Canada's Directive on Documenting Statistical Metadata
  • Statistics Canada's Integrated Risk Management Policy
  • Statistics Canada's Integrated Risk Management Framework
  • International Monetary Fund's CPI Manual: Concepts and Methods
  • Other internal Statistics Canada documents, as applicable.

Appendix B: Validation activities

The Guidelines for the Validation of Statistical Outputs define the specific validation steps that all mission critical programs must follow, unless a justification can be given as to why the step could not be completed. Validation activities are divided into two groups:

  • subject matter-based activities
  • process-based activities.

The following subject matter-based validation activities are required of mission critical surveys:

  1. Analysis of changes over time: To analyze changes over time, a consistent time series of a particular statistic over a sequence of time points is created.
  2. Verification of seasonally adjusted estimates: For monthly or quarterly data presenting seasonally adjusted estimates, a validation of the results can be appropriate. Seasonal adjustment is designed to eliminate the effect of seasonal and calendar influences in infra-annual data to allow for more meaningful comparisons of economic conditions from period to period.
  3. Verification of estimates through cross-tabulations: This analysis is normally done at a finer level of disaggregation than that of the published estimates. Such tabulations allow checking of the internal consistency of the data file and provide the ability to explore the underlying characteristics associated with the estimates.
  4. Coherence analysis based on known current events: Coherence analysis based on known current events means validating estimates against domain intelligence and recent events affecting the sector.
  5. Confrontation with other similar sources of data published by Statistics Canada: Confrontation with other similar sources of data, either published by Statistics Canada or external to the agency, can provide insight into whether reported microdata and aggregate estimates are reasonable.
  6. Consultation with stakeholders internal to Statistics Canada: Stakeholders who have either direct knowledge of the specific subject matter being studied or are experts in a related subject matter could be consulted.
  7. Participation in the Daily Forum: The Daily Forum serves as an interactive venue in which analysts involved in the release and interpretation of mission critical estimates meet to share information on major findings and trends, as well as exchange information on events or factors that are relevant to the interpretation of these data. It also provides all participants with information on the set of related indicators from different statistical programs that could be examined when performing quality assurance on specific data releases.
  8. Formal briefing to the Executive Governance: Each mission critical program must present its pre-release results to the Executive Governance (i.e., the Strategic Management Committee). The presentation includes an overview of any changes implemented or issues encountered during the production operations, the impact that those changes may have had on the estimates, and the risk mitigation strategy used. Program management issues must also be presented, outlining any increased risk of error caused by human resource or employee management concerns, along with the mitigation strategy used.
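As an illustration of the first activity in the list above (analysis of changes over time), the following minimal sketch computes period-over-period percentage changes from a time series of index values and flags unusually large movements for follow-up. The function names, the threshold and the index values are hypothetical and do not come from the Guidelines.

```python
def percent_changes(index, lag=1):
    """Percentage change of each value relative to the value lag periods
    earlier, rounded to one decimal place."""
    return [
        round((index[i] / index[i - lag] - 1) * 100, 1)
        for i in range(lag, len(index))
    ]


def flag_unusual(changes, threshold):
    """Positions in changes where the absolute movement exceeds threshold."""
    return [i for i, change in enumerate(changes) if abs(change) > threshold]


# Illustrative index values only.
index = [100.0, 100.5, 101.0, 104.0]
changes = percent_changes(index)
flags = flag_unusual(changes, threshold=1.0)
print(changes, flags)
```

With monthly data, setting `lag=12` would give the 12-month change instead of the month-over-month change.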

The following process-based validation activities are required of mission critical surveys:

  1. Review of production processes: Production processes include, for example, frame creation, sample design, collection, coding, editing, imputation, and weighting systems. They also include other interventions, such as the linking of administrative data or the calculation of derived variables.
  2. Coherence analysis based on quality indicators: The quality indicators associated with a program should be used to determine if an estimate is sound. For statistical surveys, quality indicators based on survey design, collection results and processing are used. Examples of quality indicators include, but are not limited to, imputation rates, metadata and paradata, and revision rates of estimates.
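One of the quality indicators named above, the imputation rate, can be sketched as follows. This is a minimal, unweighted illustration. It assumes each observation is a record carrying an `imputed` flag. The field name and the data are hypothetical, and a production indicator might instead be weighted.

```python
def imputation_rate(records):
    """Unweighted share of observations whose value was imputed."""
    if not records:
        raise ValueError("no observations supplied")
    imputed = sum(1 for rec in records if rec["imputed"])
    return imputed / len(records)


# Hypothetical observations: 2 of 8 were imputed.
observations = [{"id": i, "imputed": i in (2, 5)} for i in range(8)]
rate = imputation_rate(observations)
print(f"Imputation rate: {rate:.1%}")
```

Tracking such an indicator from period to period gives reviewers a basis for judging whether an estimate rests on an unusual amount of imputed data.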