Workshops

Workshops will be held on October 14, from 10 a.m. to 2 p.m. Eastern Daylight Time (EDT, UTC-4).

Thursday, October 14, 2021

Workshop 1

Ethics and Privacy

Abstract

Protecting the privacy of Canadians is paramount for Statistics Canada. However, with the evolution of data collection methods and changes in the nature of the data themselves, new ethical challenges are emerging. This workshop on privacy and ethics includes a number of presentations on these challenges and several new initiatives launched by Statistics Canada to address them.

Biography

Guillaume Maranda works for the Data Ethics Secretariat at Statistics Canada, within the Quality Secretariat. He has a PhD in Philosophy and a Master's Degree in Statistics.

Martin Beaulieu joined Statistics Canada as a survey methodologist in 2001. He spent most of his career as a methodologist on various economic programs, such as the CPI and tax data programs, as well as surveys on energy and transportation. He has been the Chief of the Quality and Data Ethics Secretariat since 2019.

Keven Bosa works for the Data Science Division at Statistics Canada as the chief of the Quality and Methods Section. He has a Master's Degree in Statistics.

Miguel da Costa e Silva works for the Data Ethics Secretariat at Statistics Canada, within the Quality Secretariat. He has a Master's Degree in Philosophy.

Raphaël Duteau worked for Statistics Canada's Canadian housing statistics program as an economist. He now works for Employment and Social Development Canada as a principal policy analyst for the Data science division. He has a Master's degree in Applied Economics.

Loic Muhirwa is a methodologist working in the Data Science Division at Statistics Canada. He is also a Master's student at the University of Ottawa where he studies machine learning applications to medical imaging.

Dr. David Robichaud is an associate professor at the University of Ottawa, where he teaches moral and political philosophy. He works mainly on the nature of trust, on linguistic justice and on contractarianism.

Workshop 2

Overview of Applied Data Science use cases at Statistics Canada

Abstract

New alternative data sources are already showing many benefits, including: providing faster and timelier products, reducing response burden on households and businesses, producing more accurate results and lowering costs. This is fundamentally changing the way statistical agencies operate. Many of these new opportunities require the use of machine learning methods and this workshop aims to give an overview of applied Data Science use cases at Statistics Canada to showcase these opportunities.

During the workshop, we will showcase examples of machine learning (ML) modelling for COVID-19, including agent-based modelling and reinforcement learning; the use of natural language processing (NLP) techniques to classify Census comments; PDF information extraction and open-source optical character recognition (OCR) information extraction; the use of data visualisation tools in data science applications; satellite image processing using ML techniques; and data engineering techniques for building data pipelines for statistical programs.

Biography

Monica Pickard is a chief in the Product Delivery Section of the Data Science Division at Statistics Canada. Monica manages multiple projects that use a variety of machine learning algorithms for unstructured text classification, satellite image classification, and structured and unstructured PDF extraction and classification. Many of these projects involve large data sets. Prior to joining the Data Science Division, Monica worked as an Enterprise Portfolio Manager, acting as the Statistics Canada focal point for the 25 major corporations within her portfolio. Monica has been working for Statistics Canada for the past 12 years and, prior to that, worked in Europe in the private sector as an economist. Monica has an M.Sc. in Agriculture Economics from McGill University.

Anurag Bejju is a Senior Data Scientist in the Data Science Division at Statistics Canada. He completed his master's in computer science at Simon Fraser University, with a specialization in Big Data. He has been leveraging state-of-the-art machine learning algorithms to research, develop and implement new methods and techniques that can effectively extract information, such as text, numerical tables and images, directly from unstructured data sources (like PDFs, images or scanned electronic documents). His contributions in this area have a positive impact on the ability of the organization to carry out its mandates and improve the overall quality, efficiency and productivity of the services and products that are offered.

Nicholas Denis is a senior data scientist in the Data Science Division of Statistics Canada. He holds an M.Sc. in Mathematics from the University of Ottawa, where he studied Reinforcement Learning. He has published papers on theoretical and applied machine learning and presented at conferences including Neural Information Processing Systems (NeurIPS) and the Canadian Artificial Intelligence Conference (CAIC).

Sayema Mashhadi works as a Data Scientist at Statistics Canada. She completed her master's in Electrical and Computer Engineering at the University of Waterloo and has work experience in the fields of Natural Language Processing, Machine Learning and Information Extraction.

Andrés Solís Montero is a senior data scientist at Statistics Canada with a Ph.D. in Computer Science, specializing in Computer Vision, Image Processing, and Machine Learning. He is a published author with over 1,000 citations and has been a technical reviewer for more than nine years. He is a versatile data scientist with a broad knowledge of Software Development Life Cycles. With over 15 years of full-stack software development experience, he has managed and led multiple machine learning products from conception to delivery.

Shirin Roshanafshar is a Data Science Product Manager in the Data Science Division of Statistics Canada. Shirin holds an M.Sc. in Statistics from Carleton University and has a passion for data and analytics. Shirin manages a group of data scientists and numerous data science projects in various areas such as Machine Learning, data analytics, Natural Language Processing and automation.

Nikhil Widhani is a Data Scientist in the Data Science Division (DScD) of Statistics Canada. He graduated from the University of Ottawa with a Master's degree in Electronic Business Technologies and holds a Bachelor of Science degree in Computer Science. His experience spans data analysis, data engineering, NLP and web development. Previously, he worked at Environment and Climate Change Canada on a variety of projects. Currently, he is working on projects related to information extraction and data visualisation.

Joanne Yoon is a Senior Data Scientist at Statistics Canada and organizes the interdepartmental community of practice on NLP. She completed both her bachelor's in Software Systems and her Professional Master of Science in Big Data at Simon Fraser University.

Workshop 3

A Data Science Approach to Official Statistics Estimation: Leveraging the Power of Machine Learning Models
Kelly McConville, Reed College, Portland, OR

Abstract

Survey estimation is the bread-and-butter of many statistical agencies. As technological and statistical advances provide both new data sources and new modeling techniques, estimation procedures must adapt to accommodate these advances. Effectively combining data collected under a complex sampling design with new sources of auxiliary data has the potential to greatly increase the efficiency of our estimators. Luckily, how to best leverage these multiple data sources has been a vibrant area of recent survey research.

This workshop will introduce participants to one modern, model-assisted approach to survey estimation, where predictive models serve as the key link between the survey data and auxiliary data. This method will cover a broad class of predictive models, including generalized linear models, regularized (elastic net) regression, and regression trees. The workshop will also include demonstrations of how to fit these estimators using the statistical software R. R Markdown files with the relevant code will be provided so that participants can actively follow along with the demonstrations. Prior R experience is encouraged but not required.

Biography: Kelly McConville

Kelly McConville is an Associate Professor of Statistics at Reed College in Portland, OR. Her methodological research involves incorporating novel modeling techniques into survey estimators. She has active collaborations with the US Bureau of Labor Statistics and the US Forest Service Forest Inventory and Analysis Program and runs the Reed Forestry Data Science Research Lab. In addition to her regular teaching duties, she has taught several continuing education short courses, webinars, and workshops on R and various data science and statistics topics.
