Web scraping during the COVID-19 pandemic

As society continues to face rapid changes as a result of COVID-19, it is more important than ever to provide Canadians with reliable and timely data that support informed decision-making. Under the current circumstances, relying solely on traditional methods to gather information presents a growing challenge.

Statistics Canada is supporting the information needs of Canadians, as well as the Public Health Agency of Canada and other frontline agencies, to help track and manage COVID-19. To ensure timely information is available, Statistics Canada is using web scraping techniques to gather relevant data from a variety of websites on COVID-19.

Web scraping is a technique that allows Statistics Canada to provide the Government of Canada with current, essential information on COVID-19 and its impact on society and the economy. This method is an efficient way to gather information that would otherwise have to be collected manually, a time-consuming and error-prone process.

In keeping with the agency's commitment to openness and transparency, Statistics Canada's web scraping best practices call for notifying the relevant websites that web scraping activities will be taking place. However, because of the speed at which data are required during the COVID-19 pandemic, Statistics Canada will not notify public sector websites (such as federal, provincial, territorial and regional government websites) of web scraping activities. The agency will continue to notify the websites of non-public sector organizations when it intends to perform web scraping activities.

Statistics Canada has been using and testing web scraping methods in accordance with its necessity and proportionality framework. The agency is ensuring that the use of web scraping follows its rigorous statistical, privacy protection and ethical standards. No personal information is accessed or collected during this process.

The agency is committed to producing statistical products that are adapted to meet the urgent needs of Canadians.
