Health Reports
Development of a population-based microsimulation model of body mass index

by Deirdre Hennessy, Rochelle Garner, William M. Flanagan, Ron Wall and Claude Nadeau

Release date: June 21, 2017

In both the developed and the developing world, obesity has increased, with global prevalence more than doubling since 1980 to 13%.Note 1Note 2Note 3Note 4Note 5 This dramatic change has popularly been labelled the “obesity epidemic.”

Obesity is a risk factor for several diseases, including diabetes, osteoarthritis, cardiovascular disease, kidney diseases, and certain cancers. As the prevalence of obesity rises, so will the prevalence of these conditions,Note 6Note 7Note 8Note 9 which are associated with substantial morbidity, increased risk of mortality, and greater economic costs to individuals and society.Note 10

As a consequence, obesity surveillance has become an international priority.Note 11 Halting the rising prevalence of obesity has been established as a global goal by the World Health Organization.Note 11 To understand the implications for population health, some countries have constructed models to project body mass index (BMI) and obesity trends over time.Note 12Note 13

Microsimulation modelling is particularly useful for studying BMI trends, as it can simultaneously account for population dynamics such as aging, migration, and mortality. As well, the longitudinal framework of such models allows BMI to evolve over the life course of simulated individuals, to interact with factors such as physical activity, and to contribute to the risk of multiple diseases.Note 14 While typically used to evaluate programs such as taxation and pension policy,Note 15 microsimulation modelling is increasingly used in the health arena.Note 16Note 17Note 18Note 19 It offers a method of comparing the consequences of health policy changes in advance of implementation.

The POpulation HEalth Model (POHEM), developed at Statistics Canada,Note 14Note 20Note 21 has been used to quantify the health and economic burden of osteoarthritis,Note 22Note 23Note 24 to investigate trends in risk factors for cardiovascular diseases,Note 25 and to project trends in BMI and physical activity.Note 21Note 26 POHEM-BMI was a collaboration between the Public Health Agency of Canada and Statistics Canada.

This overview of POHEM-BMI describes the development of BMI prediction models for adults and of childhood BMI history, and compares projected BMI estimates with estimates from nationally representative survey data to establish validity. Longer-term projections of adult BMI are presented to demonstrate the utility of POHEM-BMI for tracking future risk factor and disease trends to support evidence-based policy making.

POpulation HEalth Model: Overview

POHEM is a continuous-time, Monte Carlo microsimulation tool in which the basic unit of analysis is the individual. POHEM integrates data distributions and equations derived from nationally representative cross-sectional and longitudinal surveys, vital statistics, and disease registries.

Producing projections from POHEM entails several steps.Note 20 The initial population, composed of adults aged 20 or older, comes from the cross-sectional 2001 Canadian Community Health Survey (CCHS). Initial values for BMI and other characteristics are updated periodically, based on predictive algorithms and transition matrices. As individuals age and their risk factors are updated, their risk of disease changes. Simultaneously, as individuals age and die, the population demographics change, based on observed and projected demographic data.Note 27 The modelled estimates of risk factor or disease prevalence are validated,Note 28 and if necessary, calibrated. Finally, baseline and counterfactual projections for risk factor or disease prevalence are produced.
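The update cycle described above can be illustrated with a toy example. The following is a minimal sketch, not POHEM itself: it uses annual time steps rather than continuous time, and the mortality function, the BMI drift and the starting population are all hypothetical values chosen only to make the loop runnable.

```python
import random
from dataclasses import dataclass


@dataclass
class Person:
    age: int
    bmi: float
    alive: bool = True


def annual_mortality(age: int) -> float:
    """Hypothetical mortality risk that rises with age (not a POHEM parameter)."""
    return min(1.0, 0.0005 * 1.09 ** (age - 20))


def update_bmi(person: Person, rng: random.Random) -> float:
    """Hypothetical risk-factor update: a small upward drift plus random noise."""
    return person.bmi + 0.05 + rng.gauss(0.0, 0.3)


def simulate(population, years, rng):
    """Age, update and remove individuals year by year.

    Returns the mean BMI of surviving individuals for each simulated year.
    """
    mean_bmi_by_year = []
    for _ in range(years):
        for p in population:
            if not p.alive:
                continue
            if rng.random() < annual_mortality(p.age):
                p.alive = False  # the individual dies and leaves the population
                continue
            p.age += 1
            p.bmi = update_bmi(p, rng)
        survivors = [p.bmi for p in population if p.alive]
        mean_bmi_by_year.append(
            sum(survivors) / len(survivors) if survivors else float("nan")
        )
    return mean_bmi_by_year


rng = random.Random(42)
# Hypothetical starting population; POHEM starts from the 2001 CCHS instead.
population = [
    Person(age=rng.randint(20, 80), bmi=rng.gauss(26.0, 4.0)) for _ in range(1000)
]
trajectory = simulate(population, years=10, rng=rng)
```

In this sketch the population summary changes for two reasons at once: individual BMI values evolve, and the set of survivors changes. That is the feature of microsimulation the text emphasizes.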

POpulation HEalth Model: BMI modules

In the adult model, the initial BMI values come from the 2001 CCHS. Subsequent values are determined by multivariate BMI prediction models constructed using the longitudinal National Population Health Survey (NPHS).Note 29 Both self-reported and measured BMI can be projected. Measured BMI is derived from self-reported BMI using a conversion algorithm. Within POHEM-BMI, self-reported BMI is a predictor of multiple diseases, health status, and mortality. As well, it is involved in estimating other risk factors, including physical activity, cholesterol, and hypertension. Figure 1 displays the predictors in the adult BMI model and the outcomes and other risk factors for which BMI is used as a predictor. The BMI and physical activity modules have an interacting relationship: physical activity is used as a predictor of BMI, which, in turn, predicts physical activity. The POHEM-Physical Activity model has been described elsewhere.Note 30

POHEM-BMI does not simulate children. However, for actors born into the simulation at age 20, a “BMI history” that shows what their BMI would have been at ages 6 to 19 is created. This BMI history module is stand-alone in that it allows for static reporting of childhood and adolescent BMI history among 20-year-olds created in any year, but does not interact with other parts of the model. The childhood BMI history model was developed and validated using the 2004 CCHS, which collected height and weight measurements for children aged 2 or older.

Development of a predictive BMI model for adults

Data source

The longitudinal component of the NPHS followed a group of respondents randomly selected in 1994/1995 to be representative of the Canadian household population living in the 10 provinces. Residents of Indian Reserves and Crown lands, health care institutions and some remote areas, and full-time members of the Canadian Forces were excluded. The initial sample of 17,276 respondents was re-interviewed every second year. The design, sample and interview procedures of the NPHS are described elsewhere.Note 29Note 30Note 31

Data from the first seven NPHS cycles (1994/1995 through 2006/2007) were used to derive the predictive model of self-reported BMI change for adults. The longitudinal data showed how sociodemographic and health behaviour characteristics contributed to the evolution of self-reported BMI over time.

Weight, and therefore BMI, tends to be under-reported in surveys.Note 32 To address this limitation, an algorithm was developed to convert self-reported to measured BMI, using data from the 2004 CCHS, which collected both self-reported and measured height and weight for a subsample of approximately 5,000 individuals.

Validation data sources

Validation involved comparisons of POHEM-BMI projections with estimates from four CCHS cycles between 2001 and 2014. The CCHS,Note 33Note 34 a nationally representative cross-sectional survey, has collected self-reported information on BMI, other lifestyle risk factors and chronic diseases at regular intervals since 2001. It covers the non-institutionalized household population aged 12 or older in all provinces and territories, except residents of Indian Reserves and Crown lands, institutions and certain remote areas, and full-time members of the Canadian Forces.

Projections of measured BMI were validated against estimates from the Canadian Health Measures Survey (CHMS),Note 35Note 36Note 37Note 38Note 39 a direct health measures survey. Between March 2007 and December 2013, three CHMS cycles were completed at 49 sites across Canada. The survey provides nationally representative estimates from a sample of Canadians aged 6 to 79 living in private households. The CHMS excludes residents of Indian Reserves or Crown lands, institutions and certain remote regions, and full-time members of the Canadian Forces.

Modelling techniques and model selection

BMI predictive model: Sex-specific, auto-regressive models were constructed to predict adult BMI. Previous BMI values were the main explanatory variables; respondents could have up to four previous BMI measurements from the longitudinal data (NPHS). Depending on the autoregressive order of the model (from 1 to 4), a different set of coefficients for the covariates was used. For illustrative purposes, Appendix Table A shows the coefficients for the first order model. Other covariates were age, physical activity (leisure-time physical activity; biking and walking for transportation and errands; usual-day physical activity), and smoking. As well as smoking and physical activity, variables that captured the individual’s longitudinal trend (downward or upward) in these behaviours were included to more precisely reflect exposure to these risks over time. The BMI predictive model initially included income, education, and geography as potential covariates, but they did not confer additional predictive power, and so were dropped during model selection.
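The idea of switching coefficient sets according to the number of available previous BMI values can be sketched as follows. This is an illustration only: the coefficient values, the covariate names (`active`, `smoker`) and the restriction to two orders are assumptions made for brevity. The actual coefficients are those in Appendix Table A, and the actual model has orders 1 to 4, is sex-specific, and includes trend variables that are omitted here.

```python
# Hypothetical coefficients keyed by autoregressive order.
# These are NOT the values from Appendix Table A.
COEFFS = {
    1: {"intercept": 1.2, "lags": [0.95], "age": 0.002,
        "active": -0.15, "smoker": -0.10},
    2: {"intercept": 0.9, "lags": [0.70, 0.26], "age": 0.002,
        "active": -0.12, "smoker": -0.08},
}


def predict_bmi(history, age, active, smoker):
    """Predict the next BMI value from previous values (most recent first).

    The autoregressive order is the number of previous values available,
    capped at the highest order for which coefficients exist.
    """
    order = min(len(history), max(COEFFS))
    if order == 0:
        raise ValueError("at least one previous BMI value is required")
    c = COEFFS[order]
    value = c["intercept"] + c["age"] * age
    # Weighted sum of the lagged BMI values used by this order.
    value += sum(w * b for w, b in zip(c["lags"], history[:order]))
    value += c["active"] * active + c["smoker"] * smoker
    return value


first_order = predict_bmi([25.0], age=40, active=1, smoker=0)
second_order = predict_bmi([25.0, 24.0], age=40, active=0, smoker=0)
```

A simulated individual with a single previous BMI value is handled by the first-order coefficients; once a second value is available, the second-order set applies.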

Conversion model: Self-reported BMI was converted to measured BMI using the equation and coefficients in Appendix Table B. In this algorithm, measured BMI is modelled as a function of self-reported BMI, education (less than secondary graduation, secondary graduation, some postsecondary, postsecondary graduation), and age group.
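A conversion of this kind might look like the sketch below. The linear form, every coefficient value and the age-group boundaries are assumptions made for illustration; only the list of predictors (self-reported BMI, the four education levels, and age group) comes from the text. The actual equation and coefficients are in Appendix Table B.

```python
# All numeric values below are hypothetical placeholders, not Appendix Table B.
INTERCEPT = -0.5
SLOPE = 1.05
EDUCATION_EFFECT = {
    "less than secondary graduation": 0.3,
    "secondary graduation": 0.2,
    "some postsecondary": 0.1,
    "postsecondary graduation": 0.0,
}
# Hypothetical age groups; the source does not list the groups it uses.
AGE_GROUP_EFFECT = {"20-39": 0.0, "40-59": 0.1, "60+": 0.2}


def age_group(age: int) -> str:
    """Map an adult age to one of the hypothetical age groups."""
    if age < 20:
        raise ValueError("the adult model covers ages 20 or older")
    if age < 40:
        return "20-39"
    if age < 60:
        return "40-59"
    return "60+"


def to_measured_bmi(self_reported: float, education: str, age: int) -> float:
    """Convert self-reported BMI to an estimate of measured BMI."""
    return (
        INTERCEPT
        + SLOPE * self_reported
        + EDUCATION_EFFECT[education]
        + AGE_GROUP_EFFECT[age_group(age)]
    )


example = to_measured_bmi(25.0, "postsecondary graduation", 30)
```

With these placeholder values, a self-reported BMI of 25.0 maps to a higher estimated measured BMI, which is the direction of correction expected when weight is under-reported.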

Adjustment for calendar time: When new individuals are born into the simulation at age 20, they “inherit” an initial BMI value from an individual in the starting population (CCHS 2001). Because it may be unrealistic for 20-year-olds in future years to have the same BMI distribution as 20-year-olds in 2001, the ability to alter BMI over time is programmed. Additional analysis of the CCHS from 2001 to 2009 was undertaken to construct an algorithm that adjusts BMI for calendar time separately for males and females. This feature can be switched on or off in POHEM-BMI as needed. The current results were generated without adjusting for calendar time.

Development of a predictive model of childhood BMI history

The predictive model of childhood BMI history is based on measured data from the 2004 CCHS 2.2.Note 33Note 34 The same data source was used to validate the POHEM-BMI output.

To predict childhood BMI history, sex-specific growth curve modelling techniques were used. The model of BMI distribution covers 6- to 19-year-olds and does not include any covariates besides four cubic spline functions of age (details about this model are available on request). Although the model was built using cross-sectional data, a sequence of plausible BMI values by age that exhibited a similar correlation structure to that in longitudinal data was sought. Additional analysis using the NPHS revealed that the auto-correlation structure of children’s BMI over time resembled that of an autoregressive moving average (ARMA) model, which meant it exhibited slow exponential decay after a sharper initial decline. Using the ARMA model, correlated quantiles of BMI for children were created.
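One way to produce correlated quantiles across ages is sketched below. It is an assumption-laden illustration, not the authors' procedure: it uses an ARMA(1,1) process with hypothetical parameters `phi` and `theta`, a burn-in period to approximate the stationary distribution, and a normal CDF to turn each latent value into a quantile. With a positive `phi` and a negative `theta`, the autocorrelation of an ARMA(1,1) process drops at the first lag and then decays geometrically, which is the pattern the text describes.

```python
import math
import random
from statistics import NormalDist


def correlated_quantiles(ages, phi, theta, rng, burn_in=200):
    """Return one quantile in [0, 1] per age from a latent ARMA(1,1) series.

    phi and theta are hypothetical parameters, not estimates from the NPHS.
    """
    # Stationary standard deviation of ARMA(1,1) with unit-variance noise,
    # used to rescale the latent series to roughly standard normal.
    scale = math.sqrt((1 + 2 * phi * theta + theta ** 2) / (1 - phi ** 2))
    standard_normal = NormalDist()
    x, previous_noise = 0.0, 0.0
    quantiles = {}
    for step in range(burn_in + len(ages)):
        noise = rng.gauss(0.0, 1.0)
        x = phi * x + noise + theta * previous_noise
        previous_noise = noise
        if step >= burn_in:
            quantiles[ages[step - burn_in]] = standard_normal.cdf(x / scale)
    return quantiles


history = correlated_quantiles(
    ages=list(range(6, 20)), phi=0.95, theta=-0.4, rng=random.Random(1)
)
```

Each quantile would then be looked up in the age- and sex-specific BMI distribution from the growth curve model to obtain a BMI value, so that a simulated child who is high in the distribution at one age tends to stay high at neighbouring ages.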

Integrating predictive models into the POHEM environment

The BMI prediction models were integrated into the POHEM framework by implementing the algorithms in Modgen code.Note 40 Pieces of code that control the population dynamics, risk factors and disease status updates are written as separate modules and compiled into a Microsoft Visual Studio solution that generates an executable file. The POHEM executable uses the parameters, including population counts, mortality rates and equation coefficients, to run the simulation and generate output.

The outputs are a set of pre-programmed tables that show average projected BMI, projected distribution of BMI categories (underweight, normal weight, overweight, and obese), and the distribution of adult weight categories given child weight category. The tables can be customized for any outcome. Because of the modular format, newly developed pieces of code can be integrated with the existing program. Once modules are implemented and validated, additional functionality can be added. For instance, POHEM can perform counterfactual analyses by changing model inputs and then projecting alternative distributions of the outcome of interest.

A set of intervention parameters allows for changes (reductions or increases) to the BMI of the population. These interventions can be targeted by various population characteristics, including year, age group, sex, BMI category, and disease risk. With this facility, it is possible to conduct comparative analysis of different interventions.
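A targeted intervention of this kind could be represented as in the following sketch. The `Intervention` class, its field names and the example scenario are hypothetical and do not reflect the actual POHEM parameter format. Targeting by disease risk is omitted for brevity, and a BMI threshold stands in for BMI category.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Intervention:
    """Hypothetical description of a targeted change in BMI."""
    start_year: int
    end_year: int
    min_age: int
    max_age: int
    sex: Optional[str]   # None means both sexes are targeted
    min_bmi: float       # only individuals at or above this BMI are targeted
    bmi_change: float    # negative for a reduction, positive for an increase

    def applies_to(self, year: int, age: int, sex: str, bmi: float) -> bool:
        return (
            self.start_year <= year <= self.end_year
            and self.min_age <= age <= self.max_age
            and (self.sex is None or self.sex == sex)
            and bmi >= self.min_bmi
        )


def apply_intervention(iv: Intervention, year: int, age: int, sex: str,
                       bmi: float) -> float:
    """Return the BMI after the intervention, or the unchanged BMI."""
    return bmi + iv.bmi_change if iv.applies_to(year, age, sex, bmi) else bmi


# Hypothetical scenario: one-unit reduction for obese adults aged 20 to 39.
scenario = Intervention(2020, 2030, 20, 39, None, 30.0, -1.0)
targeted = apply_intervention(scenario, 2025, 30, "F", 32.0)
untargeted = apply_intervention(scenario, 2025, 50, "F", 32.0)
```

Running the same simulation with and without such a scenario gives the baseline and counterfactual projections that are compared.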

Validation framework

Kopec et al. have detailed recommendations for validating disease simulation models,Note 28 the most pertinent of which for this study relate to: parameter quality, computer implementation of the model, and evidence of model performance. The BMI prediction model parameters were derived from statistical analyses of population-based surveys. These data sources (NPHS, CCHS and CHMS) and the methods used to analyse them are well-documented.Note 29Note 30Note 31Note 33Note 34Note 35Note 36Note 37Note 38Note 39 POHEM-BMI was implemented in Modgen code,Note 40 a computer language developed for microsimulation, which has been used to program a number of other models at Statistics Canada and internationally. Evidence for model performance was gathered by comparing the projected outputs to input data (used to develop the models) and to external data (not used to develop the models).

Statistical analysis

The POHEM-BMI projections and validation estimates from the CCHS (various cycles) and CHMS (cycles 1 through 3) were age-standardized to the 2001 adult (20 or older) population in Figure 2 to allow comparisons over time. The data in Figures 3, 4 and 5 are unstandardized. World Health Organization cut-offs for BMI were used to classify adults and children into BMI categories.Note 41Note 42 Variance estimates that account for design effects were calculated using the bootstrap technique for survey-based estimates. Analyses were conducted using SAS 9.3 and Stata 11.
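The two calculations named above, classification by BMI cut-offs and direct age standardization, can be sketched briefly. The adult cut-offs are the standard World Health Organization values (18.5, 25 and 30); the age groups, rates and standard population counts are hypothetical examples, not figures from the 2001 population. Children are classified with different, age-specific cut-offs that are not shown here.

```python
def bmi_category(bmi: float) -> str:
    """Classify an adult BMI using the WHO cut-offs of 18.5, 25 and 30."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal weight"
    if bmi < 30.0:
        return "overweight"
    return "obese"


def age_standardize(rates: dict, standard_population: dict) -> float:
    """Directly standardize age-specific rates to a standard population.

    Each age-specific rate is weighted by that age group's share of the
    standard population.
    """
    total = sum(standard_population.values())
    return sum(
        rates[group] * count / total
        for group, count in standard_population.items()
    )


# Hypothetical age-specific obesity prevalence and standard population counts.
obesity_by_age = {"20-39": 0.10, "40-59": 0.20, "60+": 0.30}
standard_population = {"20-39": 500, "40-59": 300, "60+": 200}
standardized_prevalence = age_standardize(obesity_by_age, standard_population)
```

Holding the age weights fixed in this way means that differences between years reflect changes in age-specific prevalence rather than changes in the age structure of the population.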


Adult BMI

The predictors (and coefficients) used to model adult BMI are described in Figure 1 and Appendix Table A. After integration of the models into the POHEM framework, projections of self-reported and measured BMI (Figure 2) were validated by comparing simulation outputs to estimates from the CCHS and CHMS, respectively. The projections and estimates agree well, especially for the percentage obese.

POHEM-BMI projections of average self-reported BMI and the distribution of self-reported BMI categories from 2001 to 2030 are shown in Figure 3 for men and in Figure 4 for women. For both sexes, average self-reported BMI is projected to rise by more than one BMI unit between 2001 and 2030. The percentage overweight is projected to remain relatively stable, while the percentage obese increases among both sexes. According to POHEM projections of self-reported BMI, about 59% of the adult population will be overweight/obese by 2030. Men have a higher initial average self-reported BMI than do women, and a greater percentage are projected to become overweight/obese. Also, the increase in obesity among men is projected to be steeper than that among women. Trends are similar in projections of measured BMI, but the total projected prevalence of overweight/obesity is higher, reaching 66% of the adult population by 2030 (data not shown).

History of childhood BMI

Modelled childhood BMI history was validated by comparing POHEM-BMI outputs with data from the 2004 CCHS. The projected childhood BMI history of 20-year-olds in 2015 (who had been 9 years old in 2004) was compared with that of 9-year-olds surveyed by the CCHS in 2004. Figure 5 shows close agreement between the POHEM-BMI projections and CCHS estimates of the percentage of 9-year-olds who were normal weight, overweight or obese.


Based on empirically developed BMI prediction models, by 2030, approximately 59% of Canadian adults will be overweight/obese according to POHEM projections of self-reported BMI (66% according to POHEM projections of measured BMI). These projected increases are comparable to those of the Foresight microsimulation model, which has been used for the United Kingdom (UK), the United States, and other countries,Note 10Note 12Note 43 and to statistical projections (non-microsimulation) for Australia.Note 13 A greater increase in the prevalence of obesity among men than women was also noted for Russia and Poland by the Foresight researchers.Note 44Note 45 However, the increase in the prevalence of overweight/obesity projected by POHEM-BMI is not as steep as that projected by the Foresight models, which predict levels as high as 72% for the UK population by 2035Note 9, and 80% for the Irish population by 2030.Note 46 Similar results were reported by Walls et al.Note 13 for Australia, where 72% of the adult population is projected to be overweight/obese by 2025.

The slower rise in overweight/obesity in Canada has also been shown in analyses by the Organization for Economic Co-operation and Development.Note 47 However, it is important to consider how POHEM-BMI differs from other models. POHEM-BMI was constructed with longitudinal self-reported BMI data, whereas Foresight was originally developed using cross-sectional measured BMI data from the UK Health Survey for England.Note 10Note 12Note 43 Self-reported data consistently underestimate measured BMI.Note 32 Although measured BMI data would be the ideal input for POHEM, direct measures are costly to collect. To address this limitation, POHEM-BMI contains a conversion algorithm that allows projections of both self-reported and measured BMI.

The models also differ in their use of covariates: POHEM-BMI includes age, sex, smoking and physical activity as predictors, while Foresight generated cross-sectional BMI parameters for the UK by age, sex, ethnicity, social class, and geographical region.Note 12 As well, BMI and physical activity interact in POHEM-BMI. Although Foresight projects BMI by covariates, it is not a multivariate predictive model in the same way as POHEM-BMI. POHEM-BMI predicts BMI using the covariates. The simulated value of BMI is then used as a predictor of other risk factors and diseases (Figure 1). This method of modelling allows for considerable heterogeneity in risk profiles among individuals,Note 48 so that the life course trajectories of individuals vary in ways that plausibly represent observed data. Also, the multivariate nature of POHEM-BMI offers many intervention points. For instance, co-occurring interventions that target both BMI and physical activity could be implemented, as well as interventions that tackle individual risk factors or diseases.

Notwithstanding the differences between them, the models have similar applications. In population health, policy and program evaluation has generally been undertaken after implementation. Microsimulation projections offer a way to measure the future burden of a risk factor or disease, accounting for aging and other dynamic population changes. Such analysis can determine if risk factor or disease prevalence is increasing, decreasing, or stabilizing. As well, quantification of the size of the problem (in this case, the prevalence of overweight and obesity) is worthwhile.

Information about the future incidence and prevalence of risk factors and disease by age, sex and sociodemographic characteristics is helpful for deploying public health and prevention resources. Using POHEM-BMI or similar models to project an intervention’s potential impact on the prevalence of obesity and obesity-related diseases can provide evidence that the benefits outweigh the costs.


POHEM-BMI should be considered in the context of certain limitations.

First, the estimates and projections are only as good as the input data. Survey data are subject to potential bias resulting from sampling error, incomplete coverage, non-response, and measurement error. Despite Statistics Canada’s efforts to identify and minimize such biases, they may still exist.

Second, as a measure of overweight and obesity, BMI itself is a limitation. BMI cannot account for differences in body composition. The standard cut-offs used to define overweight and obesity may not be appropriate for all ethnicities.Note 49 In addition, POHEM’s BMI predictive models do not consider certain important causal determinants of weight gain, such as diet quality and neighbourhood walkability.


Using POHEM-BMI, it is possible to produce validated projections of overweight and obesity for the Canadian population. Such projections could have important applications for surveillance of risk factor and disease prevalence, as well as for planning and comparative analysis of intervention strategies.


The Science Integration Division of the Public Health Agency of Canada provided financial support for the development of the body mass index module of the POpulation HEalth Model (POHEM). The authors thank all researchers who have used POHEM in their work and have helped refine the model over time.
