Workshops 2022

Workshops will be held on November 2, from 10am to 2pm, Eastern Daylight Time (EDT): UTC-4

Wednesday, November 2, 2022

Workshop 1

Disaggregating Racial-Ethnic Classification Systems to Improve Data Equity (Available only in English)
Dr. Tara Becker, UCLA


The classification systems most commonly used to code race-ethnicity obscure the wide variation in racial-ethnic experiences within their broad categories. This workshop will give attendees a more detailed understanding of how data collection, coding, and tabulation affect efforts to disaggregate racial-ethnic data into more granular categories that offer greater insight into the diversity within these groups. Specifically, it will provide an overview of the ways in which federal data collection guidelines influence the collection and weighting of racial-ethnic data in the United States, methods that have been used to expand upon these guidelines to collect and tabulate more granular data, when such expansions may be warranted, and the effects of data collection methodology on the representativeness of data from small racial-ethnic populations. In addition to discussing how, when, and why one should consider disaggregating racial-ethnic data, it will consider, on a conceptual rather than statistical level, the ways in which methodological decisions, such as the language(s) in which a survey is administered, the method(s) used to oversample racial-ethnic groups, and weighting decisions, can affect the generalizability of statistical estimates derived from these data and influence what we know about more granular racial-ethnic populations.


Dr. Tara Becker is a Program Officer for the Committees on National Statistics and Population in the Division of Behavioral and Social Sciences and Education at the National Academies of Sciences, Engineering, and Medicine. She serves as the Study Director for a study examining the measurement of sex, gender identity, and sexual orientation and another examining the older workforce and employment at older ages. She has served as a Program Officer for a study examining the well-being of LGBTQI+ individuals and another examining high and rising working-age mortality rates in the United States. Before joining the National Academies, she was a Senior Public Administration Analyst and Senior Statistician for the California Health Interview Survey at the Center for Health Policy Research at the University of California, Los Angeles, where she conducted research on disparities in health insurance coverage and access to health care, the disaggregation of racial/ethnic data, and survey methodology and data quality. Prior to this, she was a postdoctoral fellow in the Department of Health Policy and Management at the University of California, Los Angeles and a Biostatistician at the University of Wisconsin, Madison Department of Biostatistics and Medical Informatics. She was trained as a demographer and has a B.A. in sociology and mathematics, an M.S. in sociology, an M.S. in statistics, and a Ph.D. in sociology from the University of Wisconsin, Madison.

Workshop 2

Design and Analysis of Survey Data in Python – (Introduction to Python, as well as sampling and estimation using Python) (Available only in French)
Dr. Mamadou Diallo, Samplics LLC


Survey samples are often selected from finite populations using predefined probabilistic methods. Complex sampling designs (e.g., stratification, clustering, and multistage sampling) are used to facilitate fieldwork and keep costs under control, and they typically result in samples with unequal selection probabilities. Sample selection, weight adjustment, and sample analysis all need to account for the complexity of the sampling design.
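
To illustrate why unequal selection probabilities matter, here is a minimal sketch in plain Python. It does not use samplics, and it is not taken from the workshop material. The population, the stratum names, and the helper functions are hypothetical. Each sampled unit carries a design weight equal to the inverse of its selection probability, and the weighted estimate is compared with the naive unweighted mean.

```python
import random


def stratified_sample(population, n_per_stratum, seed=0):
    """Draw a simple random sample without replacement within each stratum.

    Returns a list of (value, weight) pairs, where weight = N_h / n_h is the
    inverse of the selection probability in stratum h.
    """
    rng = random.Random(seed)
    sample = []
    for stratum, values in population.items():
        n_h = n_per_stratum[stratum]
        weight = len(values) / n_h
        for value in rng.sample(values, n_h):
            sample.append((value, weight))
    return sample


def weighted_total(sample):
    """Weighted (Horvitz-Thompson style) estimate of the population total."""
    return sum(value * weight for value, weight in sample)


def weighted_mean(sample):
    """Weighted estimate of the population mean."""
    return weighted_total(sample) / sum(weight for _, weight in sample)


# Hypothetical population: a large stratum and a small, oversampled stratum.
population = {
    "urban": [10.0] * 900,
    "rural": [50.0] * 100,
}
sample = stratified_sample(population, {"urban": 10, "rural": 10})

unweighted_mean = sum(value for value, _ in sample) / len(sample)
print("unweighted mean:", unweighted_mean)
print("weighted mean:  ", weighted_mean(sample))
```

In this toy population the true mean is 14. The unweighted sample mean overstates it because the small stratum is oversampled, while the weighted mean recovers it.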

Until the development of samplics, Python users did not have a comprehensive library for designing and analyzing complex survey samples. The Python package samplics provides modules for sample size calculation, sample selection and weighting, and population parameter estimation, including small area estimation.

This workshop will provide a tour of samplics for survey statisticians. It is designed to be accessible to statisticians with little to no Python experience. We will briefly review Python concepts and syntax before diving into the statistical content. The second part of the workshop will demonstrate how to use samplics in the design phase of a survey for sample size calculation and sample selection. We will also illustrate the samplics APIs for adjusting sample weights for non-response and for calibrating them. The last part of the workshop will focus on estimating population parameters such as means and ratios, including domain estimation and small area estimation examples.


Dr. Mamadou S. Diallo has more than 15 years of experience in survey sampling, small area estimation, and other topics in statistics. He is the founder of Samplics LLC and the creator of the Python package samplics. Samplics LLC helps organizations develop data systems and machine learning solutions to optimize the use of data for decision making. It is also committed to advancing data tools and skills for official statistics.

Previously, he worked for UNICEF as the immunization data team lead at the global level. Mamadou also worked as a senior survey statistician at Westat in Rockville, MD, where he supported numerous national and international surveys. Before that, he worked for Statistics Canada, where he contributed to the 2006 census under-coverage survey and to the Canadian Community Health Survey (CCHS). Mamadou holds a PhD in Statistics from Carleton University, Ottawa, Canada, where he conducted research on small area estimation under the direction of Prof. J.N.K. Rao. He also holds an MSc in Statistics from Laval University, Quebec, Canada, and a BSc (equivalent) in Mathematics from Claude Bernard University, Lyon, France.

Workshop 3

Methods for Multiple Frame Surveys (Available only in English)
Dr. Fulvia Mecatti, University of Milano-Bicocca


In a Multiple Frame (MF) survey, independent samples are selected from two or more separate frames, usually overlapping each other. MF surveys have been used since the 1970s for their potential to cope with old issues such as reducing survey costs and improving frame coverage. Newer challenges, such as declining response rates, oversampling difficult-to-reach or rare population segments of interest, the "leave no one behind" principle overarching the Sustainable Development Goals of the UN 2030 Agenda, along with emerging data sources, collection modes, and integration needs, invite one to investigate modular multi-frame surveys with a refreshed view.
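
As a concrete illustration of the overlap problem, the sketch below shows a Hartley-type dual-frame estimator of a population total for the simplest case of two frames. It is an assumption-laden toy example, not the workshop's own method. The function name, the data layout, and the default mixing parameter `theta` are hypothetical. Units reachable through only one frame keep their full design weight, while units in the overlap are down-weighted by `theta` in the frame A sample and by `1 - theta` in the frame B sample, so the overlap domain is not counted twice.

```python
def hartley_total(sample_a, sample_b, theta=0.5):
    """Hartley-type dual-frame estimate of a population total.

    sample_a, sample_b: lists of (value, weight, in_overlap) tuples, one list
    per frame, where weight is the design weight within that frame and
    in_overlap says whether the unit also belongs to the other frame.
    theta: share of the overlap domain assigned to the frame A sample.
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie between 0 and 1")

    total = 0.0
    for value, weight, in_overlap in sample_a:
        factor = theta if in_overlap else 1.0
        total += factor * weight * value
    for value, weight, in_overlap in sample_b:
        factor = (1.0 - theta) if in_overlap else 1.0
        total += factor * weight * value
    return total


# Hypothetical samples: (value, design weight, unit is in both frames?)
frame_a_sample = [(2.0, 10.0, False), (4.0, 10.0, True)]
frame_b_sample = [(4.0, 5.0, True), (6.0, 5.0, False)]

print(hartley_total(frame_a_sample, frame_b_sample, theta=0.5))
```

Any fixed `theta` between 0 and 1 avoids double counting the overlap. In practice the choice of `theta` affects the variance of the estimate.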


Dr. Fulvia Mecatti has a PhD in Methodological Statistics from the University of Trento and is a professor of statistics at the University of Milano-Bicocca. She has long experience teaching statistics courses in undergraduate social science programs, to master's and PhD students, and to non-technical audiences. She is a former director of the PhD program in Statistics at the University of Milano-Bicocca and a former president of the S2G-Survey Sampling Group of the Italian Statistical Society. Her main research interests are sampling theory, finite population resampling, causal inference in biomedical observational studies, and gender statistics. She has served as a facilitator with the WHO Tuberculosis Monitoring & Evaluation team and is a co-author of several WHO publications, including the 2015 guidelines "TB prevalence surveys: a handbook". Since 2021 she has been a consultant with the UN Inter-Secretariat Working Group on Household Surveys and with UN Women, contributing to the guidelines on "Sampling to Leave no One Behind" (to appear, online).

She has a longstanding relationship with Statistics Canada. She visited J.N.K. Rao in 2003 and has since also visited and collaborated with David Haziza on variance estimation and with A.C. Singh on multiple frame surveys.
