Survey design

Results

All (37) (0 to 10 of 37 results)

  • Articles and reports: 11-522-X202200100010
    Description: Growing Up in Québec is a longitudinal population survey that began in the spring of 2021 at the Institut de la statistique du Québec. Among the children targeted by this longitudinal follow-up, some will experience developmental difficulties at some point in their lives. Those same children often have characteristics associated with higher sample attrition (low-income family, parents with a low level of education). This article describes the two main challenges we encountered when trying to ensure sufficient representativeness of these children, in both the overall results and the subpopulation analyses.
    Release date: 2024-03-25

  • Articles and reports: 12-001-X202300200001
    Description: When a Medicare healthcare provider is suspected of billing abuse, a population of payments X made to that provider over a fixed timeframe is isolated. A certified medical reviewer, in a time-consuming process, can determine the overpayment Y = X - (amount justified by the evidence) associated with each payment. Typically, there are too many payments in the population to examine each with care, so a probability sample is selected. The sample overpayments are then used to calculate a 90% lower confidence bound for the total population overpayment. This bound is the amount demanded for recovery from the provider. Unfortunately, classical methods for calculating this bound sometimes fail to provide the 90% confidence level, especially when using a stratified sample.

    In this paper, 166 redacted samples from Medicare integrity investigations are displayed and described, along with 156 associated payment populations. The 7,588 examined (Y, X) sample pairs show that (1) Medicare audits have high error rates: more than 76% of these payments were considered to have been paid in error; and (2) the patterns in these samples support an “All-or-Nothing” mixture model for (Y, X) previously defined in the literature. Model-based Monte Carlo testing procedures for Medicare sampling plans are discussed, as well as stratification methods based on anticipated model moments. In terms of viability (achieving the 90% confidence level), a new stratification method defined here is competitive with the best of the many existing methods tested and seems less sensitive to the choice of operating parameters. In terms of overpayment recovery (equivalent to precision), the new method is also comparable to the best of the many existing methods tested. Unfortunately, no stratification algorithm tested was ever viable for more than about half of the 104 test populations.
    Release date: 2024-01-03
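
The "classical methods" the abstract refers to compute, in their simplest form, a stratified expansion estimate of the total overpayment minus a one-sided normal margin. A minimal sketch under that reading, with made-up stratum data and a hypothetical function name (neither is from the paper):

```python
import math

def stratified_lower_bound(strata):
    """90% lower confidence bound for a population total from a
    stratified simple random sample (normal approximation).
    strata: list of (N_h, sample overpayments in stratum h)."""
    total_est = 0.0
    var_est = 0.0
    for N_h, y in strata:
        n_h = len(y)
        mean = sum(y) / n_h
        # unbiased sample variance within the stratum
        s2 = sum((v - mean) ** 2 for v in y) / (n_h - 1)
        total_est += N_h * mean
        # variance of the expansion estimator, with finite population correction
        var_est += N_h ** 2 * (1 - n_h / N_h) * s2 / n_h
    z = 1.2816  # one-sided 90% standard normal quantile
    return total_est - z * math.sqrt(var_est)
```

Rounding and the choice of quantile vary in practice; the abstract's point is that this kind of bound can fail to reach its nominal 90% confidence level when the sample is stratified.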

  • Articles and reports: 12-001-X202300200006
    Description: Survey researchers are increasingly turning to multimode data collection to deal with declining survey response rates and increasing costs. An efficient approach offers the less costly modes (e.g., web) first, followed by a more expensive mode for a subsample of the units (e.g., households) within each primary sampling unit (PSU). We present two alternatives to this traditional design. One alternative subsamples PSUs rather than units to constrain costs. The second is a hybrid design that includes a clustered (two-stage) sample and an independent, unclustered sample. Using a simulation, we demonstrate that the hybrid design has considerable advantages.
    Release date: 2024-01-03
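
The cost-precision trade-off such simulations quantify is usually summarized by the design effect of clustering, deff ≈ 1 + (m − 1)ρ. A short sketch of that standard approximation (the formula is textbook; the parameter values are illustrative, not from the paper):

```python
def cluster_deff(m, rho):
    """Approximate design effect of a clustered sample with m sampled
    units per PSU and intracluster correlation rho."""
    return 1 + (m - 1) * rho

def effective_sample_size(n, m, rho):
    """Nominal sample size n discounted by the clustering design effect."""
    return n / cluster_deff(m, rho)
```

For example, 10 units per PSU with ρ = 0.05 inflates variance by 45%, which is the kind of penalty an unclustered component of a hybrid design avoids.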

  • Articles and reports: 11-522-X202100100007
    Description: The National Center for Health Statistics (NCHS) annually administers the National Ambulatory Medical Care Survey (NAMCS) to assess practice characteristics and ambulatory care provided by office-based physicians in the United States, including interviews with sampled physicians. After the onset of the COVID-19 pandemic, NCHS adapted NAMCS methodology to assess the impacts of COVID-19 on office-based physicians, including: shortages of personal protective equipment; COVID-19 testing in physician offices; providers testing positive for COVID-19; and telemedicine use during the pandemic. This paper describes challenges and opportunities in administering the 2020 NAMCS and presents key findings regarding physician experiences during the COVID-19 pandemic.

    Key Words: National Ambulatory Medical Care Survey (NAMCS); Office-based physicians; Telemedicine; Personal protective equipment.

    Release date: 2021-10-22

  • Articles and reports: 12-001-X201900200006
    Description:

    This paper presents a new algorithm to solve the one-dimensional optimal stratification problem, which reduces to determining the stratum boundaries. When the number of strata H and the total sample size n are fixed, the stratum boundaries are obtained by minimizing the variance of the estimator of a total for the stratification variable. The algorithm uses the Biased Random Key Genetic Algorithm (BRKGA) metaheuristic to search for the optimal solution. This metaheuristic has been shown to produce good-quality solutions for many optimization problems in modest computing times. The algorithm is implemented in the R package stratbr, available from CRAN (de Moura Brito, do Nascimento Silva and da Veiga, 2017a). Numerical results are provided for a set of 27 populations, enabling comparison of the new algorithm with some competing approaches available in the literature. The algorithm outperforms simpler approximation-based approaches, as well as a couple of other optimization-based approaches. It also matches the performance of the best available optimization-based approach, due to Kozak (2004). Its main advantage over Kozak’s approach is its coupling of the optimal stratification with the optimal allocation proposed by de Moura Brito, do Nascimento Silva, Silva Semaan and Maculan (2015), thus ensuring that if the stratification boundaries obtained achieve the global optimum, then the overall solution is the global optimum for both the stratification boundaries and the sample allocation.

    Release date: 2019-06-27
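
The BRKGA itself is too involved to sketch here, but the classical approximation that optimization-based stratification methods are typically benchmarked against, the Dalenius-Hodges cumulative √f rule, fits in a few lines. The bin count and names are illustrative, and the sketch assumes the stratification variable is not constant:

```python
import math

def cum_sqrt_f_boundaries(values, H, bins=20):
    """Dalenius-Hodges cumulative sqrt(frequency) rule: approximate
    boundaries for H strata of a one-dimensional stratification variable."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    # histogram of the stratification variable
    freq = [0] * bins
    for v in values:
        freq[min(int((v - lo) / width), bins - 1)] += 1
    # cumulative sqrt(f) over the bins
    cum, running = [], 0.0
    for f in freq:
        running += math.sqrt(f)
        cum.append(running)
    # cut the cumulative scale into H equal pieces
    step = running / H
    bounds, target, i = [], step, 0
    for _ in range(H - 1):
        while cum[i] < target:
            i += 1
        bounds.append(lo + (i + 1) * width)
        target += step
    return bounds
```
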

  • Articles and reports: 12-001-X201800254954
    Description:

    In recent years, balanced sampling techniques have experienced a resurgence of interest. These techniques constrain the Horvitz-Thompson estimators of the totals of auxiliary variables to be equal, at least approximately, to the corresponding true totals, to avoid the occurrence of bad samples. Several procedures are available to carry out balanced sampling: the cube method (Deville and Tillé, 2004) and, as an alternative, the rejective algorithm introduced by Hájek (1964). After a brief review of these sampling methods, motivated by the planning of an angler survey, we investigate, using Monte Carlo simulations, the survey designs produced by these two sampling algorithms.

    Release date: 2018-12-20
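
Hájek's rejective algorithm, in its equal-probability special case, is simple to sketch: draw Poisson (independent Bernoulli) samples and keep the first one of the target size. The general algorithm uses unequal inclusion probabilities; this equal-probability version is only an illustration:

```python
import random

def rejective_sample(N, n, max_tries=100000):
    """Hajek-style rejective sampling, equal-probability case: draw
    Bernoulli samples with inclusion probability n/N and reject
    until the realized sample size equals n."""
    p = n / N
    for _ in range(max_tries):
        s = [i for i in range(N) if random.random() < p]
        if len(s) == n:
            return s
    raise RuntimeError("no fixed-size sample found")
```
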

  • Articles and reports: 12-001-X201500214249
    Description:

    The problem of optimal allocation of samples in surveys using a stratified sampling plan was first discussed by Neyman in 1934. Since then, many researchers have studied the problem of sample allocation in multivariate surveys, and several methods have been proposed. These methods fall into two classes: the first comprises methods that seek an allocation which minimizes survey costs while keeping the coefficients of variation of estimators of totals below specified thresholds for all survey variables of interest; the second aims to minimize a weighted average of the relative variances of the estimators of totals given a maximum overall sample size or a maximum cost. This paper proposes a new optimization approach for the sample allocation problem in multivariate surveys, based on a binary integer programming formulation. Several numerical experiments showed that the proposed approach provides efficient solutions to this problem, which improve upon a ‘textbook algorithm’ and can be more efficient than the algorithm by Bethel (1985, 1989).

    Release date: 2015-12-17
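
The univariate baseline behind all of these allocation methods is Neyman allocation, n_h ∝ N_h S_h. A sketch, noting that the naive rounding here may not sum exactly to n (handling that properly is part of what integer-programming approaches address):

```python
def neyman_allocation(N, S, n):
    """Neyman allocation: sample size per stratum proportional to
    N_h * S_h, where N lists stratum sizes and S lists stratum
    standard deviations, for a total sample size n."""
    weights = [N_h * S_h for N_h, S_h in zip(N, S)]
    total = sum(weights)
    # naive integer rounding, at least one unit per stratum
    return [max(1, round(n * w / total)) for w in weights]
```
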

  • Articles and reports: 12-001-X201400114003
    Description:

    Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely for this IID world. Applying these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to developing statistical methods that analyze complex survey data while accounting for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in the finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion that inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, adjusting the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered, unequal-probability-of-selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS), both of which use stratified, clustered, unequal-probability-of-selection sample designs.

    Release date: 2014-06-27
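
The intuition behind such methods, that a design weight says how many population units a sampled unit represents, can be caricatured by weighted resampling. This sketch is not the paper's posterior-predictive method, only the crude idea it refines:

```python
import random

def synthetic_population(sample, weights, N):
    """Expand a weighted sample into a synthetic population of size N
    by resampling units with probability proportional to their design
    weights. A caricature of 'inverting' the design, not the paper's
    finite population Bayesian bootstrap itself."""
    return random.choices(sample, weights=weights, k=N)
```
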

  • Articles and reports: 12-001-X200800210762
    Description:

    This paper considers the optimum allocation in multivariate stratified sampling as a nonlinear matrix optimisation of integers. As a particular case, a nonlinear problem of the multi-objective optimisation of integers is studied. A fully detailed example including some of the proposed techniques is provided at the end of the work.

    Release date: 2008-12-23

  • Surveys and statistical programs – Documentation: 89-631-X
    Description:

    This report highlights the latest developments and rationale behind recent cycles of the General Social Survey (GSS). Starting with an overview of the GSS mandate and historic cycle topics, we then focus on two recent cycles related to families in Canada: Family Transitions (2006) and Family, Social Support and Retirement (2007). Finally, we give a summary of what is to come in the 2008 GSS on Social Networks, and describe a special project to mark 'Twenty Years of GSS'.

    The survey collects data over a twelve-month period from the population living in private households in the 10 provinces. For all cycles except Cycles 16 and 21, the population aged 15 and older has been sampled. Cycles 16 and 21 sampled persons aged 45 and older.

    Cycle 20 (GSS 2006) is the fourth cycle of the GSS to collect data on families (the first three cycles on the family were in 1990, 1995 and 2001). Cycle 20 covers much the same content as previous cycles on families with some sections revised and expanded. The data enable analysts to measure conjugal and fertility history (chronology of marriages, common-law unions, and children), family origins, children's home leaving, fertility intentions, child custody as well as work history and other socioeconomic characteristics. Questions on financial support agreements or arrangements (for children and the ex-spouse or ex-partner) for separated and divorced families have been modified. Also, sections on social networks, well-being and housing characteristics have been added.

    Release date: 2008-05-27
Data (0) (0 results)

No content available at this time.

Analysis (35) (0 to 10 of 35 results)

  • Articles and reports: 11-522-X200600110441
    Description:

    How does one efficiently estimate sample size while building consensus among multiple investigators for multi-purpose projects? We present a template using common spreadsheet software to provide estimates of power, precision, and financial costs under varying sampling scenarios, as used in the development of the Ontario Tobacco Survey. In addition to cost estimates, complex sample size formulae were nested within a spreadsheet to determine power and precision, incorporating user-defined design effects and loss to follow-up. Common spreadsheet software can be used in conjunction with complex formulae to enhance knowledge exchange between methodologists and stakeholders, in effect demystifying the "sample size black box".

    Release date: 2008-03-17
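
The spreadsheet logic described, precision under user-defined design effects and loss to follow-up, can be mimicked in a few lines; the function names and the numbers in the example are illustrative, not from the paper:

```python
import math

def effective_n(n, deff, loss):
    """Sample size contributing information after applying a design
    effect deff and a loss-to-follow-up fraction loss."""
    return n * (1 - loss) / deff

def precision(p, n_eff, z=1.96):
    """Half-width of a 95% confidence interval for a proportion p
    at effective sample size n_eff."""
    return z * math.sqrt(p * (1 - p) / n_eff)
```

For instance, a nominal sample of 2,000 with deff = 1.5 and 20% loss to follow-up yields roughly 1,067 effective respondents, i.e., about ±3 points of precision for a 50% proportion.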
Reference (2) (2 results)

  • Surveys and statistical programs – Documentation: 11-522-X20010016293
    Description:

    This paper discusses in detail issues dealing with the technical aspects of designing and conducting surveys. It is intended for an audience of survey methodologists.

    This paper presents the Second Summit of the Americas Regional Education Indicators Project (PRIE), whose basic goal is to develop a set of comparable indicators for the Americas. This project is led by the Ministry of Education of Chile and has been developed in response to the countries' needs to improve their information systems and statistics. The countries need to construct reliable and relevant indicators to support decisions in education, both within their individual countries and the region as a whole. The first part of the paper analyses the importance of statistics and indicators in supporting educational policies and programs, and describes the present state of the information and statistics systems in these countries. It also discusses the major problems faced by the countries and reviews the countries' experiences in participating in other education indicators' projects or programs, such as the INES Program, WEI Project, MERCOSUR and CREMIS. The second part of the paper examines PRIE's technical co-operation program, its purpose and implementation. The second part also emphasizes how technical co-operation responds to the needs of the countries, and supports them in filling in the gaps in available and reliable data.

    Release date: 2002-09-12