Essential components of a BioBrowser session

Modgen components

Before using BioBrowser, it is important for the analyst to understand some of the essential components of Modgen.

database (.mdb) files
These files are created by Modgen during the simulation phase of the model.  They contain the raw data necessary to construct the graphical representation created by BioBrowser.  Although the database files can be read by BioBrowser, BioBrowser can never modify the contents of these files.  All BioBrowser sessions begin by opening a pre-existing database file.

dominant actors
These elements are at the core of any Modgen simulation exercise.  Dominant actors are usually persons or households which are created at the beginning of the simulation process and undergo changes to their characteristics as they proceed through their lives.  Dominant actors are defined by their characteristics (states) and by the events which transform their states.

non-dominant actors
Modgen simulates one case at a time where a set of dominant actors undergoes changes to its states.  One possible change to a person actor’s state is a marriage or a common-law union.  When this event has occurred, Modgen generates an appropriate spouse.  This spouse, another person actor, is termed a non-dominant person actor.  Once created, non-dominant actors undergo the same possible events as the dominant actors of the same type.  Non-dominant actors are linked to their dominant actor.

tentative actors
The process of generating a non-dominant actor in Modgen involves generating a sequence of potential candidates.  The candidates who are not chosen are termed tentative actors since they have no links to any of the dominant actors in the model.

states
These elements define the characteristics of the actors over the span of their lifetimes.  Examples of states might include age, employment status, or educational attainment.  States can be scalars or arrays.

Before beginning to use BioBrowser, a database file needs to be created using Modgen. If you want to examine states which are not in the database, a new Modgen simulation must be run and a new database file needs to be created.  A sample database demo(trk).mdb was included with this software package.  For more information on creating new databases in Modgen please refer to the Modgen Developer's Guide or, for a quick overview/refresher, see Appendix: Creating a new Modgen database file.

BioBrowser components

BioBrowser takes the database and creates graphics of the characteristics of the actors.  In addition to the above Modgen concepts, there are other concepts which relate specifically to BioBrowser.

biography (.bbr) files
These files contain the graphical representations which the analyst has created during a BioBrowser session.  The biography files can be created, saved, and edited by the analyst during a BioBrowser session.

display band
The graphical display of a state or linked actor.

filter
The criteria used to narrow or refine the set of actors to be used in a biography.

navigation band
A type of display band which also includes a set of buttons which allows the user to go from the display bands of one actor to another and add new states to the biography.  The buttons resemble the control buttons on the front of a CD player.

Date modified:

The BioBrowser menu and toolbar

Menu commands

The BioBrowser menu bar contains a set of standard menus available in most Microsoft Office applications, as well as some application-specific commands.  Some of the same functions may also be available as toolbar buttons or through keyboard equivalents.

Menu Commands

Pop-up menus:  Some commands are only available from pop-up menus or by double-clicking on the chart area over the display bands of the desired state.  Right-click to access the pop-up menus. These menus differ depending on whether the state is a simple state or a linked actor. For simple states such as “employed” below, the following commands are available:

Pop-up menu showing the five options available for simple states

For the filter tracking band and linked actors, access is also provided to the navigation band commands, as shown below:

Pop-up menu showing options available for filter tracking bands and for linked actors

The toolbar

The toolbar provides quick access to the most frequently used menu items and commands in the BioBrowser application. Each button is described by a tool tip or status bar description. If you have a small screen at low resolution, you may choose not to display the toolbar: choose Tools/Options and click the View Toolbar option.

Description                           Menu equivalent
Create new biography                  File / New
Open saved biography                  File / Open
Save biography                        File / Save
Print active biography                File / Print
Copy active biography to clipboard    Edit / Copy
Undo last add                         Edit / Undo Last Add
Show or hide grid lines               Format / Grid Lines
Show or hide guide lines              Format / Guide Lines
Show or hide navigation bands         Format / Navigation Bands
Change background colour              Format / Background colour
Change chart colour                   Format / Chart Colour
Invoke BioBrowser Help                Help / Contents

Display and output options

Formatting the chart area

When a state is added various defaults for chart presentation are used depending on its Modgen state type.  At present, five chart types are permitted:

  • Line
  • Level
  • Horizontal Bar
  • Point
  • Event

BioBrowser recognizes the following Modgen state types and plots them by default according to their state type:

State Type        Default Chart Type
Integer           Level
Long              Level
Floating Point    Line
Double            Line
Time              Line
Logical           Horizontal Bar
Classification    Event
Range             Level

The Line style draws one line between 2 adjacent points, whereas the Level style draws two lines (a vertical then a horizontal) between 2 points.  The Horizontal Bar, although most appropriate for logical, classification and range types of Modgen states, can be used on all states.  For continuous states such as float or double, the horizontal bar uses colour interpolation between a start and end colour defined by the user.  No legend is available for Horizontal Bar.

The default colour for Line, Level and Point plots is blue.  The default colours for Horizontal Bar are white and gray.  You can control the line thickness, band width and point size for the biography but not at the level of a single state.  At present, these settings are global to the biography window.  All such settings are saved with the biography.

Double-click on a chart within a biography, or right click on it and select State Properties from the pop-up menu, to re-format it for chart type and colour.  A format string for the Y-Axis labels where appropriate can be changed at this time as well.  The following dialog box is used to set or adjust these properties:

Dialog box that allows state properties (chart type, colour) to be changed

If the chart type selected is Horizontal Bar, a second colour will be presented for selection. For logical states, these will be the False and True colours. For all other state types these will be used as a start colour and end colour in a colour interpolation process.
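As a rough illustration of the colour interpolation described above, the following Python sketch maps a continuous state value to a colour between a start and an end colour. Linear blending in RGB space, the function name interpolate_colour and the explicit value range are assumptions made for this example; this guide does not say how BioBrowser actually performs the interpolation.

```python
def interpolate_colour(value, lo, hi, start_rgb, end_rgb):
    """Linearly blend start_rgb and end_rgb by where value lies in [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Fraction of the way from lo to hi, clamped to [0, 1].
    t = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return tuple(round(s + t * (e - s)) for s, e in zip(start_rgb, end_rgb))


# The default Horizontal Bar colours named in the text (RGB values assumed).
WHITE = (255, 255, 255)
GRAY = (128, 128, 128)

# A state value halfway through its range gets a colour between the two.
midpoint_colour = interpolate_colour(50.0, 0.0, 100.0, WHITE, GRAY)
```

Values outside the range are clamped to the nearest end colour, which is one simple way to keep every bar segment drawable.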

Setting and saving display options

Use the Format menu commands to change display options for the active biography window. The View menu, which includes the ToolBar and Status Bar display options, is application global.

The Tools/Options menu can be used to set and save session defaults and display default options for new biographies. The Options dialog box consists of three tabs: General, Chart Defaults and Axes Defaults. The General options will take effect immediately, whereas the two Default tabs are used only with new biography creation. The OK button will save these defaults to your application ini file.

The General tab below sets and saves session defaults used at application startup. To change them during the session, use the View and Tools menus.

Dialog box showing Tools/Options/General tab

The chart defaults used for new biography creation consist of the following display options. To change these options for an already open biography, use the Format menu.

Dialog box showing Tools/Options/Chart Defaults tab

Axes defaults for new biography creation are set and saved within the third options tab. Axes properties for an open biography can be set by double-clicking the axes area of the chart window or by using the Axes Properties command from the Format menu.

Dialog box showing Tools/Options/Axes Defaults tab

Note that all display options and axes properties currently in effect for the open biography are saved with the biography file.

Sending a biography to the printer or clipboard

To print a biography, use File/Print or the Print button on the toolbar.

The printed biographies are sized to fit the page while maintaining their aspect ratio. The orientation used will depend on the aspect ratio of the biography window being printed, i.e., if the window is wider than tall, landscape will be used.
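The sizing rule above can be sketched in a few lines of Python. The function name fit_to_page, the page dimensions and the treatment of a square window as portrait are assumptions for illustration only; the text states just that a window wider than tall is printed in landscape and that the aspect ratio is kept.

```python
def fit_to_page(window_w, window_h, page_short, page_long):
    """Return (orientation, printed_w, printed_h) for a biography window."""
    if window_w <= 0 or window_h <= 0:
        raise ValueError("window dimensions must be positive")
    # Wider than tall -> landscape; otherwise portrait (assumed for squares).
    if window_w > window_h:
        orientation, page_w, page_h = "landscape", page_long, page_short
    else:
        orientation, page_w, page_h = "portrait", page_short, page_long
    # A single scale factor for both axes keeps the aspect ratio.
    scale = min(page_w / window_w, page_h / window_h)
    return orientation, window_w * scale, window_h * scale


# Hypothetical 800 x 400 window printed on an 8.5 x 11 page.
orientation, printed_w, printed_h = fit_to_page(800, 400, 8.5, 11.0)
```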

To send a biography to the clipboard, use Edit/Copy, Ctrl-C or the Toolbar copy button.


What do we expect from the microsimulation model RiskPaths?

What can simulation add to statistical analysis?
Desired features of a RiskPaths microsimulation model

What can simulation add to statistical analysis?

Before we can answer the question of what simulation can add to statistical analysis, we first need a good understanding of what the statistical results presented in the previous section reveal. The estimation results for the two countries and two cohorts allow us to study similarities and differences between the countries, as well as the changes in parameters over time separately for each of the individual processes. We see a remarkable similarity in parameters across the two countries especially for the pre-transition cohorts. Bulgaria differs from Russia basically only in the three times lower union dissolution risks and the slower speed of second union formation. Accordingly, comparing the pre- and post-transition cohorts, we find dramatic changes in most processes. The risk of first births was halved in the first three years of the first union with no later recovery, although the parameters stayed relatively unchanged after three years in a union. Also, in second unions, fertility dropped by more than 50%. The biggest difference between the two countries after the transition is in first union formation--rates halved in Bulgaria but stayed stable in Russia. For first union dissolution we see the opposite picture--union dissolution risks increased by around 40% in Russia while staying almost unchanged in Bulgaria.

These are typical examples of insights we can gain by single process analysis. We have separated a complex system into its component processes and studied the changes within those processes. In the case of fertility we have introduced relative risks--we study how certain factors (here, different union statuses) influence a single process. This is a very typical analytical question; the scientific literature is rich in this kind of research.

The power of microsimulation unfolds when we study various processes simultaneously. Even in our very simple demographic example, results are difficult to interpret when we are interested in the effect of changes in single processes on aggregate outcomes. For example, what is the effect of Russia's 40% increase in union dissolution risks on childlessness? The effect will depend on fertility out of unions and in second unions as well as the speed of second union formation. The relative risk of fertility is higher in second unions than after three years in the first union, but second union formation takes time (during which fertility is very low) and not all women enter a second union. Do these effects cancel each other out or does union dissolution affect fertility - and in which direction? Such questions invite us to use microsimulation for sensitivity analysis. How do aggregate outcomes change in response to the change of a single parameter? Note that we now have moved analysis from the level of a single process to an analysis of system behaviour.

A comparison of the two cohorts invites a further type of system analysis--what is the relative contribution of the change in single processes to the aggregated outcome? Comparing the two simulated cohorts we see that childlessness has increased considerably in both countries but even more so in Bulgaria. We can use microsimulation to decompose the contributions of the changes in the various processes to the aggregate change. How much would childlessness have changed if only fertility parameters changed? What is the contribution of changes in union formation? Has the increase in union dissolution risk contributed to the increase in childlessness in Russia? Of course, the aggregate change is not the simple arithmetic sum of partial effects. Some process changes might have a stronger or weaker effect in the presence of changes in other processes. For example, the effect of the change in fertility in second unions will heavily depend on the likelihood of being in a second union which is subject to first union formation and dissolution risks. Microsimulation can help us to identify and better understand such interactions.
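To make the point about non-additive partial effects concrete, here is a deliberately tiny Python sketch. The outcome formula, the function name and all parameter values are hypothetical and far simpler than RiskPaths; the sketch only shows how changing one process at a time leaves a residual interaction term.

```python
def childlessness(p_union, p_birth_in_union):
    """Toy outcome: share of women who never have a child.

    Assumes, for illustration only, that births happen only in unions.
    """
    return 1.0 - p_union * p_birth_in_union


# Hypothetical parameter values for two cohorts; not RiskPaths estimates.
before = {"p_union": 0.9, "p_birth_in_union": 0.8}
after = {"p_union": 0.7, "p_birth_in_union": 0.5}

base = childlessness(**before)
total_change = childlessness(**after) - base

# Change one process at a time, holding the other at its old value.
union_only = childlessness(after["p_union"], before["p_birth_in_union"]) - base
birth_only = childlessness(before["p_union"], after["p_birth_in_union"]) - base

# What is left over is due to the two changes acting together.
interaction = total_change - (union_only + birth_only)
```

In this toy case the two partial effects add up to more than the total change, so the interaction term is negative: each change matters less once the other has already taken place.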

Looking at the post-transition cohort, we have already entered the domain of predictions. As data were collected 14 years after the transition, in reality no post-transition cohort has gone through its whole reproductive period. Thus, for cohort measures like childlessness, the assessment of consistency with other data sources is limited to a comparison with other projections. But we can also use our model for predictions under alternative assumptions on future changes in processes. We might have a theory that leads to the assumption that only parts of the observed changes are of a permanent nature (e.g. caused by cultural change) while others are transitory (e.g. resulting from economic crisis, therefore reversible with economic recovery). What would happen if fertility rates moved back to their initial values while slower (later) union formation persisted--or vice versa? Such an analysis can produce surprising results, as it is not always a reversal of the process which initially had the biggest overall impact that will generate the biggest opposite effect.

Are there policy implications? While our model is of course too simple for policy analysis, it does not require much imagination to see how microsimulation can support policy making.

  • In many cases, the studied phenomena are of direct policy relevance. Fertility decline will for example impact the sustainability of social security systems. A good demographic projection model can therefore produce valuable data input for subsequent good-quality planning. Microsimulation is the tool to combine separate statistical models into projection models.
  • Events simulated in a microsimulation model can also be policy targets themselves. A government might aim at influencing fertility. This is possible if policies exist which are capable of influencing the modeled processes. However, we first have to be able to understand the individual contribution of those processes to the aggregated outcome we aim to change; thus we need microsimulation. If we can attach price tags to such policies, we are also able to use microsimulation to find the most cost-efficient policy mix - and to study possible side effects. (Socialist Russia and Bulgaria actually had a set of powerful policies in place, such as bachelor taxes and privileged access to housing for young married couples. The price tag for regulating individual life choices turned out to be rather high.)
  • Microsimulation allows us to complement the models resulting from statistical analysis with detailed policy scenarios and economic accounting models. It provides a very natural tool for policy simulation, as policies are defined at the individual or micro level. This leads to applications which integrate demographic and economic modeling.

Desired features of a RiskPaths microsimulation model

Input: Parameter tables, scenarios, and simulation settings

Even though it is a very simple model, RiskPaths has around 130 parameter values which users should be able to set and store conveniently. We would expect these parameters to be well organized in the microsimulation application, appearing as easy-to-access (or easy-to-navigate) labelled tables which can be read or modified as required.

When using a model we typically create different scenarios, i.e. different parameterizations of the model. We need to be able to save these scenarios so that certain simulations can be reproduced in future. Scenarios contain all parameter tables and, ideally, supplementary text descriptions or notes that outline the specific changes embedded in each scenario. Additionally, scenarios should include scenario settings, such as the number of simulated cases (given that RiskPaths is a case-based model). A large sample size will reduce Monte Carlo variation but comes at the cost of slower simulation runs. If we are only interested in broad aggregates, then smaller sample sizes might suffice. On the other hand, a detailed analysis of rare events or detailed breakdowns of results (e.g. by age groups) would require large samples. Additionally, users might not wish to produce all available output. Narrowing down the desired output can again speed up simulations but also leads to a more concise and focused presentation of results according to user needs.
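The trade-off between sample size and Monte Carlo variation can be sketched with the textbook standard error of a proportion. This assumes that cases are simulated independently, which is an approximation chosen for the example; the function name and the 2% event share are hypothetical.

```python
import math


def standard_error_of_share(p, n_cases):
    """Approximate Monte Carlo standard error of a simulated proportion."""
    if not 0.0 <= p <= 1.0 or n_cases <= 0:
        raise ValueError("need 0 <= p <= 1 and n_cases > 0")
    return math.sqrt(p * (1.0 - p) / n_cases)


# Hypothetical rare outcome experienced by 2% of simulated cases.
for n in (10_000, 100_000, 1_000_000):
    se = standard_error_of_share(0.02, n)
    print(f"{n:>9} cases: standard error ~ {se:.5f}")
```

Because the error shrinks only with the square root of the number of cases, a result that is ten times more precise needs roughly one hundred times as many cases, which is why rare events and fine breakdowns call for large samples.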

All of the above (parameter tables, descriptive notes, number of cases, choice of output to produce) is part of a scenario. For our RiskPaths applications, we would expect all this information to be stored together for a given scenario and we would expect it to be easily retrieved, viewed, and modified.

Output and output views

Microsimulation models can produce output on two levels: micro and macro. A microsimulation application could conceivably write all individual level characteristics and all their changes over time into a file and leave it to the user to analyse the resulting data file with statistical software. In the RiskPaths case, this would lead to a file storing the dates of all simulated events that occur over the simulated life course of each single individual. Only six events can happen in a simulated life, so each data record would contain at most six variables: four union formation / dissolution events, conception, and death. For more complex applications, file size and complexity could be enormous.

As well as such a longitudinal file, we might also be interested in cross-sectional output, recording the states of all individuals at a certain point in time. While the use of such a file is rather limited when simulating a single cohort, it would resemble a cross-sectional survey or population census in a population model.

Usually, a model user will not be interested in micro files per se but in the analysis that is performed on them. The user will typically aggregate data and produce summary indicators and tables. If model developers already know how simulated data will or should be analyzed, such measures and tables can already be calculated and produced within a microsimulation application run. In this case, users would not need to run additional statistical routines; they could see results immediately after a simulation was performed. In our RiskPaths model, output does not exceed a small number of tables and summary indicators which we expect to be produced within the application. We are interested in age-specific fertility rates, childlessness, the mean age at first conception, first conception by union status, and some mortality measures.
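As a sketch of how a longitudinal micro file could be aggregated into summary indicators, the Python example below stores one record per simulated life and derives two of the measures mentioned above. The record layout, field names and the three sample lives are invented for illustration, and event times are stored as ages for simplicity; the actual RiskPaths tables are produced inside the model itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LifeRecord:
    """One simulated life: the age at each event, or None if it never occurred."""
    first_union_formation: Optional[float] = None
    first_union_dissolution: Optional[float] = None
    second_union_formation: Optional[float] = None
    second_union_dissolution: Optional[float] = None
    conception: Optional[float] = None
    death: Optional[float] = None


def childlessness(records):
    """Share of records without a conception event."""
    return sum(r.conception is None for r in records) / len(records)


def mean_age_at_first_conception(records):
    """Mean conception age among those who conceived, or None if nobody did."""
    ages = [r.conception for r in records if r.conception is not None]
    return sum(ages) / len(ages) if ages else None


# Three made-up lives, for illustration only.
records = [
    LifeRecord(first_union_formation=22.0, conception=24.0, death=81.0),
    LifeRecord(first_union_formation=25.0, first_union_dissolution=29.0, death=77.0),
    LifeRecord(conception=30.0, death=90.0),
]
```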

Just as with parameter tables, aggregated model output also requires organization. We might want to present some summary measures of one or several related behaviours together in a table and we surely want to order table output in a meaningful way. Additionally, as with parameters, we would expect table results to be labelled for easy reading and understanding.

Because all microsimulation results are subject to Monte Carlo variation, aggregated numbers are only one view of the results. We might also be interested in getting distributional information on each table value. Such information would help us to set an appropriate population size sufficient for a desired level of result precision.

A special type of micro-data output is the graphical display of individual careers. This can be a helpful feature, as it provides users with a window to the simulated individuals, and thus a way to see the operation of the statistical models. This can also be useful for model developers as it supports model debugging. Since RiskPaths is a training tool, we are interested in displaying how individual biographies result from statistical processes. Thus, besides life course events, we might also want to see how the risks of the alternative events change over time and life course situations.

User interface and documentation

So far, we have formed expectations about the content, display, and organization of model input and output data. From the user perspective, do we just have to add a start button to complete the microsimulation application? Almost all contemporary software applications contain help files. As users of microsimulation models, we should expect access to detailed online help, not only on the use of the modeling software itself but also on the model's specific elements and the interrelationships amongst those elements.


Exploring the Modgen application RiskPaths

The user interface
Parameter tables
Performing a simulation run
Table output: aggregates and distributions
Model help and documentation
Graphical output of individual histories

In the remainder of this discussion we provide a quick explorative tour of the visual interface provided by Modgen for the RiskPaths model. To run RiskPaths, both the Modgen Prerequisites application and the RiskPaths executable have to be installed on your computer. As is true for all Modgen applications, RiskPaths contains a help system including documentation on the (model-independent) Modgen user interface and the actual RiskPaths model itself. Accordingly, in the description below, we concentrate on the central steps of running RiskPaths, leaving it to you to explore the model and software in depth with the assistance of the detailed help files.

The user interface

All Modgen applications have the same graphical user interface (Figure 1) which consists of the following parts:

  • a menu bar and a toolbar to administer and run scenarios, as well as to get help
  • a selection window containing a hierarchically grouped list of all model parameters and output tables
  • a frame in which all corresponding parameters or tables can be displayed

When starting the RiskPaths.exe application, the selection window and table frame are empty, as we first have to load (or create) a simulation scenario. To do so, follow these steps:

  • Open the simulation scenario 'base.sce'. This can be done by clicking the 'Open' button or by selecting 'Open.' from the 'Scenario' menu.
  • Choose the scenario settings--the settings dialog box can be accessed by clicking the 'Settings' button or by selecting 'Settings.' from the 'Scenario' menu. Specify a small number of simulated cases (e.g. 10,000) so that your first model runs quickly. Also, ensure that 'MS Access tracking' is switched on. This will allow you to view individual biographies using the BioBrowser tool that comes with Modgen.
  • Save your scenario under a new name by selecting 'Save as.' from the 'Scenario' menu.

User interface

Parameter tables

Users of a Modgen application have control over all parameters contained in the model's parameter tables. An individual parameter table can be selected by clicking its list entry in the selection window. The table is then displayed in the display frame, in which it can also be edited. Modgen parameter tables can have any number of dimensions, ranging from a parameter with a single checkbox to parameters with numerous characteristics or dimensions (e.g. region, sex, age, time).

Parameter table

Performing a simulation run

Click the 'Run/resume' button or select 'Run/resume' from the 'Scenario' menu. The progress of the simulation is displayed in a progress dialog box. A small sample of 10,000 actors takes around 20 seconds to run. After the model run is complete, all output tables will have been updated by Modgen.

Table output: aggregates and distributions

Simulation results are written to predefined output tables. Note that the values displayed in the output table represent only one of several possible views on the results. By right-clicking a table, a properties sheet for the table can be accessed. Among other things, this allows the display of distributional information (standard errors and the coefficient of variation) of all simulated values. Table contents can also be copied and pasted. You have the choice to copy the table as displayed, or all dimensions of the table at once (if there are more than two dimensions).
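One common way to obtain a standard error and coefficient of variation for a simulated table cell is to split a run into independent sub-samples and look at how the cell value varies across them. The Python sketch below follows that approach under the assumption of equally sized, independent sub-samples; the function name and input values are hypothetical, and this is not presented as the exact calculation Modgen performs.

```python
import math
import statistics


def distribution_summary(subsample_values):
    """Mean, standard error and coefficient of variation (in %) of a table cell."""
    k = len(subsample_values)
    if k < 2:
        raise ValueError("need at least two sub-samples")
    mean = statistics.fmean(subsample_values)
    # Standard error of the mean across independent sub-samples.
    std_err = statistics.stdev(subsample_values) / math.sqrt(k)
    cv_percent = 100.0 * std_err / mean if mean != 0 else float("nan")
    return mean, std_err, cv_percent


# Hypothetical values of one table cell in four sub-samples.
mean, std_err, cv_percent = distribution_summary([10.0, 12.0, 14.0, 16.0])
```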

Model help and documentation

As is true with all Modgen applications, RiskPaths provides help files of various types. Two are related to Modgen itself--a general user guide for the visual interface plus release notes for Modgen. The other help files are model-specific. All Modgen applications contain a detailed encyclopaedic model documentation file. This documentation is automatically created from properly commented code.

Model Help

Graphical output of individual histories

The Modgen Biography Browser (BioBrowser) application is a tool for the graphical display of individual life courses. This view on the simulation results is especially useful for model debugging. In order to use the tool, the tracking feature has to be switched on in the scenario settings. The list of variables to be tracked also has to be declared by the model developer in the model code via a tracking statement. Modgen then tracks all changes of those variables included in the tracking statement for a sample of simulated actors (where the size of this sample is specified as one of the scenario settings).

To display biographies created by RiskPaths, just start the BioBrowser application and load the tracking file of your simulation scenario, e.g. Base(trk).mdb.

Graphical Output

 


Organization of files

RiskPaths.mpp (the main simulation file)
PersonCore.mpp
Behavioural Files
Tables.mpp
Tracking.mpp
Language translation file RiskPathsFR.mpp

The Modgen code of RiskPaths is organized into eight separate .mpp files, while all RiskPaths parameter values (because RiskPaths is a simple model) are contained in just a single .dat file. In principle, a model developer has complete freedom to decide how to organize the Modgen code in different files, but a modular organization as found in RiskPaths is recommended.

Figure 2: RiskPaths file organization

Generated modules                   Filename
Simulation engine                   RiskPaths.mpp
Core actor file                     PersonCore.mpp
Table definitions                   Tables.mpp
Output tracking                     Tracking.mpp
French language translations        RiskPathsFR.mpp

Behavioural modules                 Filename
Mortality                           Mortality.mpp
Fertility                           Fertility.mpp
Union formations and dissolutions   Unions.mpp

Parameter File                      Filename
Parameters of Baseline Scenario     Base(RiskPaths).dat

Note that the .mpp code files often contain comments that resemble labels. Such comments are placed beside the declarations of symbols such as states, state levels, parameters, tables and table dimensions. Modgen does in fact interpret these comments as labels and subsequently uses them when tables or parameters are displayed within Modgen's visual interface. These labels are also used in the model's automatically generated encyclopaedic help file. Code comments that are used as labels begin with a two-character language identifier; an example is:

//EN Union status

Many such comments can be seen in the code examples that follow for RiskPaths.
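To illustrate the label convention, the following Python sketch picks out a label comment and its language identifier from a line of model code. The regular expression and the function name extract_label are assumptions made for this example; Modgen's own rules for recognizing labels may be stricter or more lenient.

```python
import re

# Two capital letters right after "//", then whitespace and the label text.
LABEL_RE = re.compile(r"//([A-Z]{2})\s+(.*\S)\s*$")


def extract_label(line):
    """Return (language, label) for a label comment, or None if there is none."""
    match = LABEL_RE.search(line)
    return (match.group(1), match.group(2)) if match else None


label = extract_label("range LIFE //EN Simulated age range")
```

An ordinary comment such as "// plain comment" is not treated as a label here, because no two-letter language identifier follows the slashes directly.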

For more detailed descriptions of modules, functions and events, notes that use the following syntax can also be placed in the code.

/*NOTE(Person.Finish, EN)
The Finish function terminates the simulation of an actor.
*/

These notes - besides documenting the code - are additionally used in the automatically generated encyclopaedic help file.

RiskPaths.mpp (the main simulation file)

This file contains the code essential for the definition of the model type (e.g. case-based, continuous time) as well as the simulation engine, i.e. the code that runs the entire simulation. Because RiskPaths is a case-based model, the simulation engine code loops through all cases and processes the event queues of each case. The file also identifies the languages of the model. The code in this file is mostly model independent within a class of models (e.g. continuous time, case-based) and a version of it is provided automatically when using Modgen's built-in wizard to start a new Modgen project.

For the development of our case-based, continuous time cohort RiskPaths model with an actor 'Person', the code provided by the wizard requires very few modifications. The full code of this .mpp file is less than one page in length.
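The idea of a case-based engine that loops over cases and works through each case's event queue can be sketched in Python as follows. This is a conceptual illustration, not the code generated by the Modgen wizard: the function names, the event names and the constant hazard rates are all hypothetical, and waiting times are drawn only once per event rather than being re-evaluated when states change.

```python
import heapq
import random


def simulate_case(rng, hazards):
    """Simulate one case; return the list of (time, event) that occurred.

    hazards maps an event name to a constant hazard rate (hypothetical values).
    The event "death" ends the case.
    """
    # Draw a waiting time for every event and put it in the case's queue.
    queue = [(rng.expovariate(rate), name) for name, rate in hazards.items()]
    heapq.heapify(queue)

    history = []
    while queue:
        time, name = heapq.heappop(queue)   # earliest pending event
        history.append((time, name))
        if name == "death":
            break                           # the case is finished
    return history


def run_simulation(n_cases, hazards, seed=1):
    """Loop over all cases, one at a time, and collect their histories."""
    rng = random.Random(seed)
    return [simulate_case(rng, hazards) for _ in range(n_cases)]


histories = run_simulation(1000, {"union_formation": 0.08, "death": 0.0125})
```

Events still waiting in the queue when death occurs are simply discarded, which mirrors the idea that nothing further can happen to a finished case.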

PersonCore.mpp

The only actor in RiskPaths is a person. In the file PersonCore.mpp, we have organized the code which is part of the actor declaration but not directly related to a specific behaviour. The file contains two age clocks defined as self-scheduling states (integer_age and age_status) and two actor functions, Start() and Finish(), which are performed at the creation of an actor and at her death, respectively.

In the Start() function we initialize the states time and age to 0. Both states are automatically created and maintained by Modgen and can only be changed in the Start() function. Their types depend on the model type; because RiskPaths is a continuous time model, time and age are continuous states.

The Finish() function must be called at the death event of an actor. Its role is to remove the actor from tables and from the simulation, and to recuperate any computer memory used by the actor.

All states and actor functions are declared in an "actor Person { };" block. To allow modularity in the organization of code by different life course domains, there can be multiple actor blocks in a project, typically one for each behavioural file.

The first code section of this module contains three type definitions. We first define a range LIFE.

range LIFE //EN Simulated age range
{
    0, 100
};

Range is a Modgen type which defines a range of integer values. RiskPaths limits the possible age range of persons to 100 years. This type is used to declare a derived state containing the age of a person in completed years.

The second type definition is used to divide continuous ages into 2.5 year age intervals starting at age 15.

partition AGEINT_STATE //EN 2.5 year age intervals
{
    15, 17.5, 20, 22.5, 25, 27.5, 30, 32.5, 35, 37.5, 40
};

The third definition is a classification of union types. In general, if a range, partition, or classification is used in several files, it is good practice to define it in the core actor file.

classification UNION_STATE //EN Union status
{
US_NEVER_IN_UNION,//EN Never in union
US_FIRST_UNION_PERIOD1,//EN First union < 3 years
US_FIRST_UNION_PERIOD2,//EN First Union > 3 years
US_AFTER_FIRST_UNION,//EN After first union
US_SECOND_UNION,//EN Second union
US_AFTER_SECOND_UNION//EN After second union
};

In the following code segment we declare two derived actor states and two functions. The derived states for time intervals are used to change the values of parameters that vary over time. In our model integer_age is needed because mortality risks are dependent on age in years, whereas age_status comes into play because baseline risks for first conception and first union formation are modelled to change in 2.5 year intervals after the 15th birthday.

Both integer_age and age_status have to be maintained over the simulation. The Modgen concept of derived states allows us to have them maintained automatically. Both are derived from the state age, a special state that is generated and maintained automatically by Modgen. In order to split up age into the time intervals defined in the AGEINT_STATE partition, we make use of the Modgen function self_scheduling_split. The second derived state, integer_age, can be obtained directly using the Modgen function self_scheduling_int. In order to ensure that its value stays within the range LIFE, we convert it to type LIFE using the Modgen macro COERCE.

actor Person
{
//EN Current age interval
int age_status = self_scheduling_split(age, AGEINT_STATE);

//EN Current integer age
LIFE integer_age = COERCE( LIFE, self_scheduling_int(age) );

//EN Function starting the life of an actor
void Start();

//EN Function finishing the life of an actor
void Finish();
};
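To make the two derived states more concrete, the following standalone C++ sketch mimics the values they would hold at a single instant. It is an illustration only and not Modgen code: the helper names split_index, coerce_to_range and integer_age_of are hypothetical, and the sketch assumes that a partition with n breakpoints defines n + 1 intervals, with index 0 covering everything below the first breakpoint. Modgen additionally schedules the moments at which the values change, which the sketch does not attempt.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: index of the partition interval containing `value`.
// Assumes n breakpoints define n + 1 intervals, each closed on the left,
// so index 0 is everything below the first breakpoint.
int split_index(double value, const std::vector<double>& breakpoints)
{
    // upper_bound finds the first breakpoint strictly greater than value;
    // its offset equals the number of breakpoints <= value.
    return static_cast<int>(
        std::upper_bound(breakpoints.begin(), breakpoints.end(), value)
        - breakpoints.begin());
}

// Hypothetical helper: clamp an integer into [lo, hi], in the spirit of COERCE.
int coerce_to_range(int value, int lo, int hi)
{
    assert(lo <= hi);
    return std::max(lo, std::min(value, hi));
}

// Breakpoints of AGEINT_STATE as listed in PersonCore.mpp.
const std::vector<double> AGEINT_STATE_BREAKS =
    {15, 17.5, 20, 22.5, 25, 27.5, 30, 32.5, 35, 37.5, 40};

// Age in completed years, kept inside the LIFE range 0..100.
int integer_age_of(double age)
{
    return coerce_to_range(static_cast<int>(std::floor(age)), 0, 100);
}
```

Under these assumptions, a person aged 16.2 falls into interval 1 of AGEINT_STATE and has an integer age of 16.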

The remaining code of this module is the implementation of the Start() and Finish() functions. The Finish() function is left empty as we do not require any actions, other than those automatically performed by Modgen, to take place when an actor dies.

void Person::Start()
{
// Age and time are variables automatically maintained by
// Modgen. They can be set only in the Start function
age = 0;
time = 0;
}

/*NOTE(Person.Finish, EN) The Finish function terminates the simulation of an actor.
*/

void Person::Finish()
{
// After the code in this function (if any) is executed,
// Modgen removes the actor from tables and from the simulation.
// Modgen also recuperates any memory used by the actor.
}

Behavioural Files

In RiskPaths we distinguish three groups of behaviours: mortality, fertility and union formation/dissolution. Accordingly we have organized the code into three .mpp files: Mortality.mpp, Fertility.mpp and Unions.mpp. Each behavioural file is typically arranged in three sections:

Declaration of parameters (including any type definitions that are required, first)

Declarations of actor states and events

Implementation of events

Mortality.mpp

This file defines the mortality event that ends the life of the simulated actor. Mortality.mpp is a typical behavioural module, and we follow a standard organization of the code: parameter declarations (with type definitions), actor declarations and event implementations.

Parameter declarations

Mortality is parameterized by death probabilities by age; thus, the probability of surviving another year changes at each birthday. We also introduce a parameter which allows us to 'switch off' mortality. When mortality is switched off, every actor reaches the maximum age of 100 years (which can be useful for some types of fertility analysis). Figure 3 displays the mortality parameter tables of the RiskPaths application.

Figure 3: Mortality parameters


Parameters are declared within a "parameters {...};" code block. Modgen supports standard C++ numeric types, such as int, long, float and double, a Boolean type ("logical" in Modgen's terminology), as well as the Modgen-specific range, partition and classification types that were introduced in PersonCore.mpp. The dimensionality of the parameters in the RiskPaths model is defined by classifications and ranges. The following code generates the parameters for RiskPaths, as displayed in Figure 3. For the annual death probabilities we use the range LIFE that was defined in PersonCore.mpp. The parameter_group statement groups the two mortality parameters in order to provide an ordered hierarchical selection list in the user interface (again, as displayed in Figure 3).

parameters
{
logical CanDie;//EN Switch mortality on/off
double ProbMort[LIFE];//EN Death probabilities
};

parameter_group P01_Mortality//EN Mortality
{
CanDie, ProbMort
};

Actor declarations

Actors are described by states which are changed in events. States can be either continuous (integer or real) or categorical. In the mortality module, the state of interest is whether a person is alive or not, which makes it categorical in nature. The levels of a categorical state are defined with the Modgen classification command.

We declare a state life_status of type LIFE_STATE, which is initialized with LS_ALIVE at birth and set to LS_NOT_ALIVE by the death event. It is good practice to initialize all states by assigning initial values. Each initial value, however, must be enclosed in braces, i.e. {}; otherwise, the state is implemented as a derived state.

classification LIFE_STATE //EN Life status
{
LS_ALIVE,//EN Alive
LS_NOT_ALIVE//EN Dead
};

actor Person
{
LIFE_STATE life_status = {LS_ALIVE};//EN Life Status
event timeDeathEvent, DeathEvent;//EN Death Event
};

Events are declared in the actor Person {...} block using the keyword event. Each event consists of two functions: one which returns the time of the next occurrence of the event, and one containing the code describing the consequences of the event.

Event implementation

When mortality is activated, the timeDeathEvent function returns a random time based on the mortality parameter for the given year of age. In order to obtain random durations from probabilities, we assume constant mortality hazards within each period, i.e. between birthdays. (The exception is a death probability of 1, which leads to death immediately at the start of the age year). Note that any time later than the next birthday will lead to the birthday event taking precedence over the mortality event; that is, the birthday event will censor the mortality event.

TIME Person::timeDeathEvent()
{
TIME event_time = TIME_INFINITE;
if (CanDie)
{
if (ProbMort[integer_age] >= 1)
{
event_time = WAIT(0);
}
else
{
event_time = WAIT(-log(RandUniform(3)) /
-log(1 - ProbMort[integer_age]));
}
}
// Death event can not occur after the maximum duration of life
if (event_time > MAX(LIFE))
{
event_time = MAX(LIFE);
}
return event_time;
}
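The conversion from an annual death probability to a random waiting time can be checked in isolation. The standalone C++ sketch below is an illustration and not part of RiskPaths: the function names death_wait and prob_death_within_year are hypothetical, and the uniform draw u is passed in explicitly (instead of calling RandUniform) so that the result is reproducible. It assumes a constant hazard h = -log(1 - p) within the year, so that the probability of dying within one year is again p.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper, not part of Modgen: waiting time to death in years,
// for an annual death probability p and a uniform draw u in (0, 1].
double death_wait(double p, double u)
{
    assert(p >= 0.0 && u > 0.0 && u <= 1.0);
    if (p >= 1.0) {
        return 0.0;  // certain death: the event happens immediately
    }
    if (p == 0.0) {
        return std::numeric_limits<double>::infinity();  // no risk at all
    }
    const double hazard = -std::log(1.0 - p);  // constant within the year
    return -std::log(u) / hazard;
}

// Probability that the waiting time is at most one year; recovers p.
double prob_death_within_year(double p)
{
    const double hazard = -std::log(1.0 - p);
    return 1.0 - std::exp(-hazard);
}
```

A draw of u = 1 - p yields a waiting time of exactly one year; larger draws give shorter waits, and smaller draws give waits beyond the next birthday, which are the ones censored by the birthday event.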

The event implementation function DeathEvent is straightforward. It sets the life_status to LS_NOT_ALIVE and calls the function Finish(), the latter removing the actor from the simulation and recovering any memory used by that actor.

void Person::DeathEvent()
{
life_status = LS_NOT_ALIVE;
Finish();
}

Fertility.mpp

This file defines and implements the first pregnancy event. As we are only interested in the study of childlessness in RiskPaths, no other fertility-related event is simulated. Fertility.mpp is a behavioural module, and again we follow the same standard organization of the code: type definitions, parameter declarations, actor declarations and event implementations.

Parameter declarations

Fertility is parameterized by both a baseline pregnancy risk by 2.5 year age intervals starting at the 15th birthday and a relative risk factor dependent on the union status and duration. We thus define two parameters: AgeBaselinePreg1 and UnionStatusPreg1.

Figure 4: Fertility parameters


Fertility risks use a time partition to define the columns. For the age baseline we use the partition AGEINT_STATE that was defined in PersonCore.mpp. The possible union states for the relative risk factors use the classification UNION_STATE which is declared in PersonCore.mpp as well.

parameters
{
//EN Age baseline for first pregnancy
double AgeBaselinePreg1[AGEINT_STATE];
//EN Relative risks of union status on first pregnancy
double UnionStatusPreg1[UNION_STATE];
};
parameter_group P02_Fertility //EN Fertility
{
AgeBaselinePreg1, UnionStatusPreg1
};

Actor declarations

The fertility module has two states. The simple state parity counts an actor's pregnancies, and the derived state parity_status translates it into one of two levels: 'childless' and 'pregnant'. (Only two levels are needed because RiskPaths no longer simulates an actor's fertility events after first conception.)

In Fertility.mpp, we only model one event: pregnancy. The corresponding pair of event functions is timeFirstPregEvent and FirstPregEvent.

classification PARITY_STATE //EN Parity status
{
PS_CHILDLESS,//EN Childless
PS_PREGNANT//EN Pregnant
};
actor Person
{
//EN Parity
int parity = {0};
//EN Parity status derived from the state parity
PARITY_STATE parity_status = (parity == 0) ? PS_CHILDLESS : PS_PREGNANT;
//EN First pregnancy event
event timeFirstPregEvent, FirstPregEvent;
};

Event implementation

As is true of all Modgen events, the first pregnancy event is implemented in two parts. The first determines the timing of the event, the second the consequences if the event happens. The timeFirstPregEvent function verifies whether the actor is currently at risk and, if so, draws a random duration based on the underlying piecewise constant proportional hazard regression model, which is parameterized by an age baseline and a relative risk by union status. Accordingly, the hazard rate is calculated as the product of the two parameters AgeBaselinePreg1 and UnionStatusPreg1. A random duration can be obtained from a uniformly distributed random number by the transformation:

randdur = -log(RandUniform(1)) / hazard

The Modgen function RandUniform() returns a uniformly distributed random number between 0 and 1. The function takes an integer argument used to assign a different independent random number stream to each random number function in the code. When the argument is omitted, Modgen automatically writes back a unique index into the .mpp file before translation into C++ code.
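The transformation can be derived by inverting the survival function of the exponential distribution. The following is a short sketch, assuming a hazard h > 0 that stays constant over the waiting time; the symbols T, U, h and S are introduced here for illustration only.

```latex
\begin{aligned}
S(t) &= P(T > t) = e^{-h t}, \qquad t \ge 0 \\
U &= S(T), \qquad U \sim \mathrm{Uniform}(0, 1) \\
\ln U &= -h\,T \\
T &= -\frac{\ln U}{h}
\end{aligned}
```

Since S(T) is itself uniformly distributed between 0 and 1, a uniform random number can stand in for it, which gives the expression for randdur.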

When the event happens, the state "parity" is increased by 1. (Note that the derived state parity_status is changed to "PS_PREGNANT" automatically).

TIME Person::timeFirstPregEvent()
{
double dHazard = 0;
TIME event_time = TIME_INFINITE;
if (parity_status == PS_CHILDLESS)
{
dHazard = AgeBaselinePreg1[age_status]
* UnionStatusPreg1[union_status];
if (dHazard > 0)
{
event_time = WAIT(-log(RandUniform(1)) / dHazard);
}
}
return event_time;
}
void Person::FirstPregEvent()
{
parity++;
}

Unions.mpp

The programming of union transitions introduces only minor new concepts in Modgen programming; thus, the following code discussion is mainly limited to union dissolutions. The hazard rates for both first and second union dissolution events are stored in the same parameter table, as they use the same time intervals of union duration.

In order to construct a parameter with the dimensions time and union order, we define a time partition and a classification:

partition UNION_DURATION//EN Duration of current union
{
1, 3, 5, 9, 13
};
classification UNION_ORDER //EN Union order
{
UO_FIRST,//EN First union
UO_SECOND//EN Second union
};

parameters
{
...
//EN Union Duration Baseline of Dissolution
double UnionDurationBaseline[UNION_ORDER][UNION_DURATION];
...
};

Figure 5: Union dissolution parameters


In the timeUnion1DissolutionEvent() function, hazard rates for first union dissolution are obtained as:

dHazard = UnionDurationBaseline[UO_FIRST][union_duration];

 

Accordingly, timeUnion2DissolutionEvent() references the second row from the parameter:

dHazard = UnionDurationBaseline[UO_SECOND][union_duration];

As opposed to the processes discussed so far, the union dissolution processes do not start at a predefined time (e.g. the 15th birthday) but at union formation events. The union duration spell is defined as a derived self-scheduling state in the following form:

//EN Currently in a union
logical in_union = (union_status == US_FIRST_UNION_PERIOD1
|| union_status == US_FIRST_UNION_PERIOD2
|| union_status == US_SECOND_UNION);
//EN Time interval since union formation
int union_duration = self_scheduling_split(
active_spell_duration( in_union, TRUE), UNION_DURATION);

With respect to union formation, the implementation of the clock which changes the union duration state union_status from US_FIRST_UNION_PERIOD1 to US_FIRST_UNION_PERIOD2 after three years in a first union deserves some discussion. In contrast to the self-scheduling derived states used for all other clocks of the model, here - mainly as an illustration of this alternative - we explicitly implement the clock as an event itself. This event occurs after three years in the first union. The clock is set at first union formation. The actor declaration includes a state which records the time of the status change as well as the event declaration.

actor Person
{
...
//EN Time of union period change
TIME union_period2_change = {TIME_INFINITE};

//EN Union period change event
event timeUnionPeriod2Event, UnionPeriod2Event;
};

The time for the state change is set in the first union formation event. In the code sample, WAIT is a built-in Modgen function that returns the time of the current event, plus a specified time (in our example, three years).

void Person::Union1FormationEvent()
{
unions++;
union_status = US_FIRST_UNION_PERIOD1;
union_period2_change = WAIT(3);
}

The event implementation is straightforward:

TIME Person::timeUnionPeriod2Event()
{
return union_period2_change;
}

void Person::UnionPeriod2Event()
{
if (union_status == US_FIRST_UNION_PERIOD1)
{
union_status = US_FIRST_UNION_PERIOD2;
}
union_period2_change = TIME_INFINITE;
}
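The interplay of the three functions can be traced with a small standalone C++ sketch. It is an illustration under simplifying assumptions and not Modgen code: PersonSketch, its member names and TIME_INFINITE_SKETCH are hypothetical stand-ins, the member function wait() imitates WAIT by adding a delay to the current time, and the simulation engine that would normally call the time function and fire the event is left out.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for Modgen's TIME_INFINITE.
const double TIME_INFINITE_SKETCH = std::numeric_limits<double>::infinity();

enum UnionStatusSketch { NEVER_IN_UNION, FIRST_UNION_PERIOD1, FIRST_UNION_PERIOD2 };

struct PersonSketch
{
    double time = 0.0;
    UnionStatusSketch union_status = NEVER_IN_UNION;
    double union_period2_change = TIME_INFINITE_SKETCH;  // clock not set

    // Imitates WAIT(): time of the current event plus a delay.
    double wait(double delay) const
    {
        assert(delay >= 0.0);
        return time + delay;
    }

    // First union formation sets the clock three years ahead.
    void union1_formation_event()
    {
        union_status = FIRST_UNION_PERIOD1;
        union_period2_change = wait(3.0);
    }

    // Time function: simply reports the stored clock value.
    double time_union_period2_event() const { return union_period2_change; }

    // Event function: switch period if still in period 1, then stop the clock.
    void union_period2_event()
    {
        if (union_status == FIRST_UNION_PERIOD1) {
            union_status = FIRST_UNION_PERIOD2;
        }
        union_period2_change = TIME_INFINITE_SKETCH;
    }
};
```

For a union formed at time 20, the time function reports 23 until the event has occurred, and infinity afterwards, so the event cannot fire a second time.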

Tables.mpp

Modgen provides a very powerful and flexible cross-tabulation facility to report model results. The programming of each output table usually requires only a few lines of code. RiskPaths has a single table file which contains the declarations of all of its output tables; for more detailed models, however, it is advisable to split up table declarations by behavioural groups.

The basic syntax for tables is displayed in Figure 6. The two central elements of a table declaration are the captured classificatory dimensions (defining when an actor enters and leaves a cell) and the analysis dimension (recording what happens while an actor is in that cell). Typical classificatory dimensions are age or time intervals (e.g. fertility by age), states (e.g. fertility by union status), or a combination of both. Modgen does not limit the number of dimensions.

The analysis dimension can contain many expressions, which can be states or derived states. Modgen provides a very useful list of special derived state functions which record, for example, the number of occurrences of certain events, the number of changes in states, or the duration in states. Two particularly helpful concepts are the keyword unit and the derived state function duration(). The former, i.e. unit, records the number of actors entering a table cell, whereas duration() records the total time an actor stayed in the cell.

Tables can contain filter criteria for defining if and under which conditions actor characteristics will be recorded. The Modgen table concepts are best understood by concrete examples as given below. As the full wealth of the Modgen table language goes beyond the scope of this chapter, you are also invited to consult the Modgen Developer's Guide.

Figure 6: Table Syntax

table actor_name table_name //EN table label
[filter_criteria]
{
dimension_a * //EN dimension label
...
{
analysis_dimension_expression_x, //EN expression label
...
}
* dimension_n //EN dimension label
...
};

Table 1: Life expectancy

The first table example contains summary values of our simulation and has no dimensions, i.e. cells apply to the entire population over the entire simulation period. We make use of the Modgen keyword unit, which counts the number of actors entering the cell of a table (in our example, the simulation itself), and the Modgen function duration() which sums up the time actors stay in this cell (in our example, the total years lived by all actors in the simulation). The average age at death of all actors in the simulation is then obtained by dividing duration() by unit. As for parameter declarations, comments placed in the code are used as labels in the application. (Note that in the table declaration below, the 'decimals=3' portion of the comment is used to determine the number of decimal places in the table; this part of the comment does not carry through to the label used in the report).

table Person T01_LifeExpectancy //EN 1) Life Expectancy
{
{
unit, //EN Total simulated cases
duration(), //EN Total duration
duration()/unit //EN Life expectancy decimals=3
}
};
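What the three expressions of this table accumulate can be reproduced by hand. The standalone C++ sketch below is an illustration only: SummaryCell, tabulate_lifetimes and life_expectancy are hypothetical names, and the sketch assumes that each actor enters the single cell once, at birth, and leaves it at death.

```cpp
#include <cassert>
#include <vector>

// Hypothetical accumulator for a table that consists of a single cell.
struct SummaryCell
{
    int unit = 0;           // number of actors that entered the cell
    double duration = 0.0;  // total time spent in the cell, in years
};

// Every actor enters the cell once, at birth, and stays until death.
SummaryCell tabulate_lifetimes(const std::vector<double>& ages_at_death)
{
    SummaryCell cell;
    for (double age : ages_at_death) {
        cell.unit += 1;
        cell.duration += age;
    }
    return cell;
}

// duration() / unit: average age at death of the simulated actors.
double life_expectancy(const SummaryCell& cell)
{
    assert(cell.unit > 0);
    return cell.duration / cell.unit;
}
```

Three actors dying at ages 70, 80 and 90 give unit = 3, duration = 240 and a life expectancy of 80 years.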

Table 2: Life table

In the second table we record the population by age. For output by age, we use integer_age as table dimension.

table Person T02_TotalPopulationByYear//EN Life table
{
//EN Age
integer_age *
{
unit,//EN Population start of year
duration()//EN Average population in year
}
};

unit and duration() now refer to the number of entrances into - and durations within - one year age intervals. unit thus counts the actors present at the beginning of each year, while duration() refers to the average population in the year.
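The effect of adding integer_age as a dimension can be sketched in standalone C++. The code is an illustration and not how Modgen implements tables: AgeCell and life_table are hypothetical names, and the sketch assumes that an actor enters the cell for age a at the exact age a and leaves it at age a + 1 or at death, whichever comes first.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical accumulator for one integer_age cell of the table.
struct AgeCell
{
    int unit = 0;           // actors alive at the start of the age year
    double duration = 0.0;  // person-years lived within the age year
};

// Builds one cell per age 0..max_age from the actors' ages at death.
std::vector<AgeCell> life_table(const std::vector<double>& ages_at_death,
                                int max_age)
{
    assert(max_age >= 0);
    std::vector<AgeCell> cells(max_age + 1);
    for (double death : ages_at_death) {
        // The actor enters every age cell that starts before death.
        for (int a = 0; a <= max_age && a < death; ++a) {
            cells[a].unit += 1;
            cells[a].duration += std::min(death, a + 1.0) - a;
        }
    }
    return cells;
}
```

With two actors dying at ages 0.5 and 2.25, the cell for age 0 records unit = 2 and duration = 1.5, while the cell for age 2 records unit = 1 and duration = 0.25.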

Tables 3 and 4: Age-specific fertility

As well as the keyword unit and the derived state function duration(), states and a set of other derived state functions can be used in tables. If using a state without a function, Modgen records the change of the state while in a particular cell, i.e. the value of the state when the cell is exited minus the value of the state when the cell was entered.

The expression transitions(parity_status, PS_CHILDLESS, PS_PREGNANT) / duration() therefore records the (age specific) fertility as the number of birth events divided by the average number of women by year of age.

The second expression is used to calculate the true rate, i.e. the number of birth events by exposure time. A woman is under exposure for first pregnancy when childless. We thus divide the number of events by the term 'duration( parity_status, PS_CHILDLESS )'.

The table dimension is age in full years. As fertility is 0 until age 15 and very low after 40, the age periods before 15 and after 40 are not further divided. We thus define a partition AGE_FERTILEYEARS which is used in the self_scheduling_split statement that defines the table dimension.

partition AGE_FERTILEYEARS //EN Fertile age partition
{
15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
32, 33, 34, 35, 36, 37, 38, 39, 40
};

table Person T03_FertilityByAge//EN Age-specific fertility
{
//EN Age
self_scheduling_split(age,AGE_FERTILEYEARS) *
{
//EN First birth rate all women decimals=4
transitions(parity_status, PS_CHILDLESS, PS_PREGNANT) / duration(),
//EN First birth rate women at risk decimals=4
transitions(parity_status, PS_CHILDLESS, PS_PREGNANT) / duration( parity_status, PS_CHILDLESS )
}
};
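The difference between the two expressions of Table 3 can be written compactly. In the notation below, which is introduced here for illustration only, E_a is the number of first pregnancy events in age cell a, D_a is the total time all women spend in that cell, and D_a^childless is the part of that time spent in the state PS_CHILDLESS.

```latex
\begin{aligned}
r^{\text{all}}_a &= \frac{E_a}{D_a} \\
r^{\text{risk}}_a &= \frac{E_a}{D^{\text{childless}}_a} \\
D^{\text{childless}}_a \le D_a \;&\Rightarrow\; r^{\text{risk}}_a \ge r^{\text{all}}_a
\end{aligned}
```

The two rates coincide in age cells where no woman has conceived before entering the cell or within it, and they diverge as more women leave the childless state.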

Table 4 produces first birth rates by the 2.5 year age groups used for parameterization. We also add an additional dimension, namely the union status; we thus obtain simulated values of the model parameters.

table Person T04_FertilityRatesByAgeGroup //EN Fertility rates by age group
[parity_status == PS_CHILDLESS]
{
{
parity / duration() //EN Fertility decimals=4
}
* self_scheduling_split(age, AGEINT_STATE)//EN Age interval
* union_status//EN Union Status
};

Table 5: Cohort fertility

Table 5 calculates cohort measures of fertility: the average age at first conception, childlessness, and its complement, the proportion of women with a first pregnancy. To obtain the age at pregnancy we use the Modgen derived state function value_at_transitions(parity_status, PS_CHILDLESS, PS_PREGNANT, age), which returns the value of one state (age) at a specific transition of another state, namely when parity_status changes from PS_CHILDLESS to PS_PREGNANT.

table Person T05_CohortFertility//EN Cohort fertility
{
{
//EN Av. age at 1st pregnancy decimals=2
value_at_transitions(parity_status,PS_CHILDLESS,PS_PREGNANT,age)/
transitions(parity_status, PS_CHILDLESS, PS_PREGNANT),
//EN Childlessness decimals=4
1 - transitions(parity_status, PS_CHILDLESS, PS_PREGNANT) / unit,

//EN Percent one child decimals=4
transitions(parity_status, PS_CHILDLESS, PS_PREGNANT) / unit
}
};

Table 6: Pregnancies by union status and order

In table 6 we use an example of a filter which triggers a person exactly at the entrance of a state, in our case at the occurrence of pregnancy. We are interested in the union status at first conception. Note that this filter also excludes women who stay childless.

table Person T06_BirthsByUnion //EN Pregnancies by union status & order
[trigger_entrances(parity_status, PS_PREGNANT)]
{
{
unit//EN Number of pregnancies
}
* union_status+ //EN Union Status at pregnancy
};

Table 7: First union formation risks

Like table 4, this table reproduces a parameter table. While such an output table does not contain any new information (for a sufficiently large sample size it will come close to the original model parameters), it is useful for model validation and for assessing Monte Carlo variability.

table Person T07_FirstUnionFormation//EN First union formation
[parity_status == PS_CHILDLESS]
{
//EN Age group
self_scheduling_split(age, AGEINT_STATE) *
{
//EN First union formation risk decimals=4
entrances(union_status, US_FIRST_UNION_PERIOD1)
/ duration(union_status, US_NEVER_IN_UNION)
}
};

Grouping of table output

Like parameters, output tables can also be grouped for a more meaningful presentation of results. In the application RiskPaths, we distinguish three groups of tables: life tables, fertility tables, and tables for union status.

table_group TG01_Life_Tables//EN Life tables
{
T01_LifeExpectancy, T02_TotalPopulationByYear
};

table_group TG02_Birth_Tables//EN Fertility
{
T03_FertilityByAge, T04_FertilityRatesByAgeGroup, T05_CohortFertility
};

table_group TG03_Union_Tables//EN Unions
{
T06_BirthsByUnion, T07_FirstUnionFormation
};

Tracking.mpp

The track{} code block defines the list of states to be recorded longitudinally for visual BioBrowser output. This command is frequently placed in table files. In our model, however, we have decided to code a separate Tracking.mpp file, since we also track risk patterns calculated as derived states.

track Person
{
integer_age,
life_status,
age_status,
union_duration,
dissolution_duration,
unions,
parity_status,
union_status,
preg_hazard,
formation_hazard,
dissolution_hazard
};

The file also includes the declaration of three derived states. We have used the derived state concept to calculate the three main hazard rates (pregnancy, union formation, and union dissolution) for BioBrowser output. They are for illustrative purposes only, as all hazard rates, broken down by union order, are calculated in the event functions.

The declarations of the derived states preg_hazard, formation_hazard, and dissolution_hazard are also good syntax examples of how derived states can be built from simple states using if-else style conditional expressions.

actor Person
{
//EN Pregnancy hazard
double preg_hazard = (parity_status == PS_CHILDLESS) ?
AgeBaselinePreg1[age_status] *
UnionStatusPreg1[union_status] : 0;

//EN Union formation hazard
double formation_hazard = (union_status != US_NEVER_IN_UNION
&& union_status != US_AFTER_FIRST_UNION) ? 0 :
((union_status == US_NEVER_IN_UNION) ?
AgeBaselineForm1[age_status] :
SeparationDurationBaseline[dissolution_duration] );

//EN Union dissolution hazard
double dissolution_hazard = (union_status != US_FIRST_UNION_PERIOD1 && union_status != US_FIRST_UNION_PERIOD2
&& union_status != US_SECOND_UNION) ? 0 :
((union_status == US_SECOND_UNION) ?
UnionDurationBaseline[UO_SECOND][union_duration] :
UnionDurationBaseline[UO_FIRST][union_duration]);
};
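Nested conditional expressions can be hard to read, so it may help to see the same decision logic written as explicit if statements. The standalone C++ sketch below restates the logic of dissolution_hazard as an ordinary function; it is an illustration and not a replacement for the derived state declaration. The function name dissolution_hazard_sketch is hypothetical, the enumerations only mirror the classification levels shown earlier, and the table size of six duration intervals assumes that the five breakpoints of UNION_DURATION define six intervals.

```cpp
#include <cassert>

// Mirrors of the classification levels, for this sketch only.
enum UnionState
{
    US_NEVER_IN_UNION, US_FIRST_UNION_PERIOD1, US_FIRST_UNION_PERIOD2,
    US_AFTER_FIRST_UNION, US_SECOND_UNION, US_AFTER_SECOND_UNION
};
enum UnionOrder { UO_FIRST, UO_SECOND };

// Assumed number of UNION_DURATION intervals (five breakpoints plus one).
const int N_DURATION_INTERVALS = 6;

// Hypothetical function with the same outcome as the nested ?: expression.
double dissolution_hazard_sketch(
    UnionState status, int duration_index,
    const double baseline[2][N_DURATION_INTERVALS])
{
    assert(duration_index >= 0 && duration_index < N_DURATION_INTERVALS);
    if (status == US_SECOND_UNION) {
        return baseline[UO_SECOND][duration_index];
    }
    if (status == US_FIRST_UNION_PERIOD1 || status == US_FIRST_UNION_PERIOD2) {
        return baseline[UO_FIRST][duration_index];
    }
    return 0.0;  // not in a union: no dissolution risk
}
```

The same pattern applies to formation_hazard and preg_hazard: each branch of the conditional expression corresponds to one if statement, and the final fallback corresponds to the value 0.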

Language translation file RiskPathsFR.mpp

This .mpp file only exists for models that are defined in Modgen to be multilingual (which for RiskPaths implies English and French). Even for a bilingual model, however, one of English or French is still deemed to be the first or primary language of the model. English was chosen as the primary language when RiskPaths was originally developed, and so the RiskPathsFR.mpp file essentially contains translations for the model's labels and notes in the other language, i.e. French. (If the original primary language of RiskPaths had been French, this translation file would have been called RiskPathsEN.mpp and it would have contained English translations of the labels and notes for the model.)

Normally, all notes and labels are entered as code comments in the source .mpp files, using the primary language of the model, as has been illustrated several times in the previous examples. The corresponding translations are subsequently placed in this separate .mpp file.

An add-in "Translation Assistant" software tool is available with Modgen to help the translation process by verifying the complete set of terms that require translation and by adding unique versions of the translated entities to the appropriate .mpp file (RiskPathsFR.mpp in our example).


Introduction to Modgen


Introduction

Modgen was designed to facilitate microsimulation model programming. Its purpose is to remove as many obstacles to microsimulation model creation as possible. Some of these obstacles are:

  • Interface programming
  • Documentation
  • Simulation engine programming
  • Bilingualism

Modgen eliminates all these obstacles as it provides the interface and the simulation engine, is bilingual and facilitates documentation. Obviously, this list is not exhaustive and is only given as an example.

One benefit of using Modgen to create microsimulation models is that there is no need to hire programmers to program them. Modgen takes care of most of the programming. What remains is usually simple enough that a person with no advanced knowledge of programming can do it. Having in-depth knowledge of programming can even be detrimental, since the developer then has to keep to a programming style that will be understood by the analyst responsible for the model.

Also, it is not necessary to understand in detail what Modgen does in order to be able to use it. However, it might be useful to have a general idea of what it does, and providing such information is the purpose of this document.

Modgen installation

There are two products, and thus two installation programs, associated with Modgen. These products are:

  • Modgen Prerequisites 12
  • Modgen 12

Here is what each of those products contains and a description of what they do.

Modgen Prerequisites 12

In general, Modgen Prerequisites 12 contains everything necessary for models to run on a machine. If a developer wants to distribute a model, model users must install Modgen Prerequisites 12 before installing the model. This product contains:

  • Modgen model user licence
  • Modgen model user documentation: This documentation is of concern primarily to users of the model's interfaces. It does not give any model creation assistance. This documentation includes:
    • The Modgen 12 Guide to the Visual Interface
    • The Modgen Prerequisites 12 Release Notes
  • Cleaner application: This application cleans up the temporary files left on a machine when a model simulation fails. If a simulation ends successfully, no temporary files will remain. It is installed with Modgen Prerequisites 12 and is run in each new Windows session. It is also possible to run the application manually, but you must avoid doing this when a simulation is in progress.
  • Modgen automated server: A component that allows other programs to read scenario parameter file records.
  • Microsoft Prerequisites: Some files provided by Microsoft that are needed to run Modgen models.

Modgen 12

In general, Modgen 12 contains everything needed for model development, although Visual Studio 2015 or 2017 is also required (it is not included in Modgen 12). Specifically, this product contains:

  • Modgen prerequisites
  • Modgen model creation licence
  • Everything necessary to create new models: This includes the Modgen precompiler, the Modgen library and header files.
  • Documentation for model developers: This documentation is of interest to model developers. It provides concrete assistance on model creation, syntax to be used, etc. It includes:
    • Modgen 12 Developer's Guide
    • Modgen 12 Release Notes
  • Integration with Visual Studio capabilities

Modgen components

Three main components make up Modgen. These are:

  • The Modgen precompiler
  • The Modgen library
  • The automated server

The first two components are used together to create models and are described in this document. The automated server, which makes it possible to access functions that read from and write to parameter files, will not be discussed here.

As a reminder, the chart below shows the links among the files needed to create executable simulation model files as part of an application. It also shows the inputs/outputs that executable model files need.

Chart showing the executable simulation model

The Modgen Precompiler takes the source code (.mpp files) and produces the following C++ files:

  • actors.h
  • actors.cpp
  • tabinit.h
  • tabinit.cpp
  • model.h
  • model.rc
  • parse.inf

These files, along with the Modgen Library, are compiled by the C++ Compiler to create an executable. This executable reads .sce and .dat files and produces tables.

Note that Modgen is not independent from Visual Studio. Model developers must have the appropriate version of Visual Studio to be able to create models (Visual Studio 2015 or 2017, at least the Standard Edition, for Modgen 12).

Precompiler

The precompiler is the executable file "Modgen.exe". The Modgen language in which developers write models is an extension of the C++ language. Before being compiled, model code must be converted to C++ code. That is what the precompiler does. It reads the .mpp files containing the model code and creates equivalent .cpp files, along with special files needed for model creation. Here is a more detailed description of what Modgen does at the precompilation stage:

  • Checks that the syntax is correct for every Modgen language element. The C++ syntax is then checked by the C++ compiler.
  • Checks that essential functions such as "Simulation" are in the model code.
  • Figures out the actors and creates the classes by bringing together all the actor declarations scattered throughout the .mpp files.
  • Figures out the desired tables and creates classes for tables.
  • Figures out the relationships among states and other symbols (derived states, events, tables) and creates the code necessary to maintain those relationships.
  • Creates the code necessary to update the time for each actor.
  • Creates an internal structure allowing the model to create a model help system independently as needed.

Note that this list is not exhaustive. The precompiler is called automatically when a model is built.

Library

The Modgen library contains the entities that all models have in common. At compilation time, links are made with the Modgen library to incorporate it into the model's executable file. In this way, each model is an independent executable file. Amongst other things, the library contains:

  • The interface for all Modgen models
  • The simulation engine
  • The help generator

The C++ language requires certain header files in order to incorporate the library into models. These files are installed with Modgen. In addition to the Modgen library, there are other Microsoft libraries and header files supplied with Visual Studio and incorporated into models.

Bilingualism

Canadian law requires that all software applications created by a Government of Canada agency be bilingual - i.e. that English and French be treated equally. Since Modgen is a software application created at Statistics Canada, it must be bilingual. However, in Modgen's case, more than one level of bilingualism is required. In fact, Modgen is an application that allows other applications to be created. It is thus not only necessary for Modgen to be bilingual, but also that it allows bilingual applications to be created.

Precompiler

Modgen itself is bilingual. English and French are treated as equally as possible. The language chosen must be given in an argument on the precompiler's command line.

Example:

"c:\Program Files\StatCan\Modgen 12\Modgen.exe" -EN

In practice, the Modgen precompiler is called when building a model. The language used by Modgen can be changed using a utility installed with Modgen called Langue Modgen.

Models

Modgen contains everything needed for model interfaces to be bilingual - i.e. in English and French. In addition, model developers can very easily use languages other than English and French. All that is necessary is to translate one of the files containing all the Modgen character strings (ModgenEN.mpp or ModgenFR.mpp) and include it as a module in the model.

Even though Modgen is bilingual, an interface will only have the language(s) defined in the model. To make a model bilingual, it must be defined as bilingual. Here is how to define the languages used by the Government of Canada:

// These are the languages in which the model can be viewed. 
languages 
{
	EN,	// English
	FR	// Français
};

Remember that all models distributed by the Government of Canada must be bilingual. This might not be clear if a model is a software package or a data product. However, it makes no difference since in both cases it must be fully bilingual if it is produced by a Government of Canada agency.

Modgen model elements

Modgen can be used to create two different kinds of models:

  • Case-based models: A case-based model is a model where the simulation takes place case by case. A case typically consists of a main actor with, if required, secondary actors surrounding the main actor. However, in some models, there can be more than one main actor per case if, for example, a case simulates a family. The number of cases to be simulated is defined as a run control parameter. The "LifePaths", "PopSim" and "Pohem" models are examples of case-based models.
  • Time-based models: A time-based model is a model in which a whole population is simulated together over a certain time. The duration of the simulation is then defined as a run control parameter. The "CVMM" and "HIVMM" models are examples of time-based models.

Model content

Case-based models and time-based models are made up of Modgen symbols. It is these symbols and the relationships among them that control the simulation. A short description of each symbol type is given here. For a more detailed description, please refer to the Modgen 12 Developer's Guide.

Actors

First, Modgen models simulate the lives of actors. An actor may be any entity the model developer wishes to simulate - a person, a dwelling, a pension plan, etc.

Actors are defined by their:

  • States
  • Events
  • Functions
  • Hooks

States

States describe actors' characteristics. There are two major kinds of states in Modgen models:

  • Simple states
  • Derived states

Simple states

Simple states are states whose values are not maintained automatically by Modgen. Rather, the values of simple states are changed in the code created by the model developer. Simple states should only be changed inside an event or a function called inside an event.

Example:

int age_int ;	//EN Integer age

Simple states may have an initial value. To distinguish them from derived states, initial values are assigned to simple states using curly brackets.

Example:

int age_int = {0} ;	//EN Integer age

Derived states

Derived states are states whose values are given as an expression in the declaration. Modgen maintains the value of these states throughout the simulation. Normally, these states are derived from other states, which explains their name.

Example:

//EN Year
	MODELED_TIME modeled_year = COERCE(MODELED_TIME, year);

In addition to simple expressions like the one in the example above, there are also functions that model developers may choose to use to create derived states.

Example:

//EN Age at year start 
int	age_debut_annee = value_at_latest_change(annee, age_int);

Type

Generally, states are declared to be of type:

  • Integer (int)
  • Real (float or double)
  • Logical (logical)
  • Time (TIME)
  • A type created in the model (classification, range, partition)

Events

Events play a key role in Modgen models, since the simulation proceeds by executing events. Each event is made up of a pair of functions:

  • a function to determine the time of the event
  • a function to determine the consequences of the event

Event times are updated as needed. When a time must be recalculated, Modgen re-evaluates the value of the event time function. It assumes that event times must be recalculated when:

  • one of the states used in an event time function changes its value
  • the event occurs.

An event time function should not affect the actors. This means that states can never be changed inside an event time function. The Modgen precompiler ensures that this does not happen.

The opposite is true for event functions. The functions that determine the consequences of events are what is actually executed when events occur, and it is inside these event functions that states are changed. Once initialization has been completed, states should never be changed outside an event function or a function called by an event function. Again, Modgen ensures that this does not happen.
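
The pairing of an event time function with an event function can be illustrated outside Modgen. The following is a minimal sketch in plain C++, not Modgen code: the names PersonSketch, timeDeathEvent, DeathEvent and runNextEvent are hypothetical, and the fixed time of death is an assumption made only for illustration. Declaring the time function const mirrors the rule that it must not change any state.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for an actor; this is plain C++, not Modgen syntax.
struct PersonSketch {
    double time = 0.0;            // current time of the actor
    double time_of_death = 80.0;  // assumed value, for illustration only
    bool alive = true;            // a simple state

    // Event time function: returns when the event is due.
    // It is const, so it cannot change any state of the actor.
    double timeDeathEvent() const {
        return alive ? time_of_death
                     : std::numeric_limits<double>::infinity();
    }

    // Event function: applies the consequences of the event.
    void DeathEvent() {
        assert(alive);  // the event is only due for living actors
        alive = false;
    }
};

// Advances the actor to its next event and executes it, if one is pending.
inline bool runNextEvent(PersonSketch& person) {
    const double when = person.timeDeathEvent();
    if (when == std::numeric_limits<double>::infinity()) {
        return false;  // no event is pending
    }
    person.time = when;
    person.DeathEvent();
    return true;
}
```

Once the event has occurred, evaluating the time function again returns infinity, which corresponds to the recalculation of the event time described above.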

Functions

Functions that are members of an actor may be used to generalize sections of code that will be used in one or more events. In general, a function should be created when a section of code will be used in several places. In that case, using a function simplifies maintenance since when the code has to be changed, it need only be changed in one place. Not using functions in those cases increases the risk of introducing bugs. Sometimes, we might want to use a function even if the code is not reused, either because it might be reused eventually or to increase modularity.

In addition to functions created by the model developer, there are two functions that actors must have. These are the "Start" and "Finish" functions.

"Start" function

There must be a "Start" function for each actor. It is called once for each actor when the actor is created. It contains the actor's initialization, including the initialization done by Modgen. Note that a state initialized in the "Start" function affects neither tables nor derived states such as "entrances()". For that reason, it is better to do those initializations in the "Start" function and not in the function that creates the actor. If the model developer has no initialization to do and does not include the "Start" function, Modgen creates a function containing only the initialization of states to the default values set by Modgen.

Example 1:

actor Personne  //EN Person
{
	void Start();	//EN Starts the person actor
};
void Personne::Start()
{
}

Example 2:

actor Personne  //EN Person
{
	//EN Starts the person actor
	void Start( int nObs, logical lImmigrant, double dTempsDebut, 
		logical lEnfant, Personne *prMere ); 
};  

/*NOTE(Personne.Start, EN)
	Function which starts the person actor.
*/
void Personne::Start( int nObs, bool bImmigrant, TIME tTempsDebut, bool bEnfant, Personne *prMere ) 
{
	immigre_apres_recensement = bImmigrant;
	num_recense = nObs;

	ne_apres_recensement = bEnfant;

	time = CoarsenMantissa(tTempsDebut);
...
	vivant = TRUE;
	emigrant = FALSE;
   
	// Year 
	annee = (int) time;
};

In the example above, the states "vivant" and "emigrant" should have been initialized when they were declared. In general, states initialized in the "Start" function are states whose values are not constant but may depend, for example, on values given as arguments to the function.

"Finish" function

There is also a "Finish" function to contain everything that must be done when an actor is deleted. For example, we might want all actors to which an actor is linked to be deleted also. If the model developer has nothing in particular to do before an actor is deleted and omits the "Finish" function, Modgen automatically creates a basic function. Note that Modgen deletes all links to other actors when the "Finish" function is called but does not call those actors' "Finish" functions.

Example:

actor Personne  //EN Person 
{
	void Finish();	//EN Ends Person actor 
};  

void Personne::Finish()
{
    // Empty for now
}

Hooks

Hooks are used to link functions to events or to other functions. This makes it possible to divide events or functions into different sections belonging to different modules.

In particular, many hooks are used with the "Start" and "Finish" functions to do initialization related to a module inside this module itself. By default, all hooks are inserted at the end of functions. If we want all hooks to be inserted elsewhere, we can do this by defining "IMPLEMENT_HOOK" at the appropriate place in the function.

Here is an example where hooks are inserted at the start of the function:

/* NOTE(Person.Finish,EN) 
Function which cleans up the actor once finished.
*/
void Person::Finish()
{
	Person *prChild = {NULL};
	int nIndex = {0};

	IMPLEMENT_HOOK();
...
	if ( tentative && sex == FEMALE ) 
	{
		mlChildren->FinishAll();
		mlChildrenAtHome->FinishAll();
		mlBiologicalChildren->FinishAll();
	}
}

Note that in the case of the "Finish" function, it makes more sense to have the hooks at the start of the function since the actor's properties are no longer valid at the end of it.

It can also happen that we want to insert hooks at a different place from the others. In that case, we create an intermediate function to which hooks are attached and call that function at the proper place.

Example:

actor Person	//EN Core individual
{
	void Start( ...);	//EN Function which initializes Person actor
	void StartClockHere();	//EN Start Clock dummy function
};

/* NOTE(Person.StartClockHere,EN) 
Dummy function which allows the clock to be initialized prior to all other hooked functions and derived functions.
*/
void Person::StartClockHere()
{
}

void Person::Start( ... )
{
...
	time = birth;
	StartClockHere(); 
...	
	year_of_birth = year;
	month_of_birth = month_of_year;
	day_of_birth = day_of_month;
...	
}

In this example, the "StartClockHere" function will be executed immediately after the time is initialized. The other "Start" function hooks are only executed at the end of the function. This makes it possible for them to use clock states such as "year", "month", etc. to initialize other states. The "StartClockHere" function does not contain code but rather associated hooks:

actor Person	//EN Core individual
{
	//EN Initialize the clock at birth of actor
	void StartClock();

	hook StartClock, StartClockHere;
};

It is the "StartClock" function, defined in another module, which contains the code that initializes the clock and its states such as "year", "month", etc.
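
The ordering effect of an intermediate hook point can be sketched in plain C++. This is only an illustration of the idea, not how Modgen implements hooks: HookPointSketch, PersonHookSketch and the log entries are hypothetical names introduced here.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A hypothetical hook point: functions attached to it run when
// implementHooks() is called, in the order they were attached.
class HookPointSketch {
public:
    void hook(std::function<void()> fn) {
        assert(static_cast<bool>(fn));  // an empty function cannot be hooked
        hooked_.push_back(std::move(fn));
    }
    void implementHooks() const {
        for (const auto& fn : hooked_) fn();
    }
private:
    std::vector<std::function<void()>> hooked_;
};

// Hypothetical actor with two hook points: one in the middle of Start
// (the role played by StartClockHere) and one at the end of Start.
struct PersonHookSketch {
    HookPointSketch startClockHere;
    HookPointSketch endOfStart;
    std::vector<std::string> log;  // records the order of execution

    void Start() {
        log.push_back("time = birth");
        startClockHere.implementHooks();  // clock is initialized here
        log.push_back("birth date states set");
        endOfStart.implementHooks();      // remaining hooks run last
    }
};
```

A function hooked to startClockHere runs before the birth date states are set, while functions hooked to endOfStart run only after the body of Start has finished, whatever the order in which the hooks were attached.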

Actor sets

Actor sets are collections of actors maintained dynamically by Modgen. This means that membership in an actor set is handled by Modgen based on criteria provided by the model developer. Actor sets can be multidimensional. Dimensions of actor sets are specified using states of the actors. For each combination of values of the dimensions of an actor set, subsets of actors are created. Modgen automatically ensures that each actor can only belong to one such subset. To find out more about actor sets, you can refer to the Modgen Developer's Guide.
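
As an illustration of how a multidimensional actor set keeps each actor in exactly one subset, here is a minimal plain C++ sketch. It is not Modgen's implementation: the class ActorSetSketch, the two dimensions (sex and an integer region) and the use of numeric actor identifiers are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <utility>

enum class Sex { Female, Male };

// One cell per combination of dimension values: (sex, region).
using Cell = std::pair<Sex, int>;

class ActorSetSketch {
public:
    // Called whenever a dimension state of the actor changes:
    // the actor leaves its old subset and enters the new one.
    void update(long actorId, Sex sex, int region) {
        remove(actorId);
        const Cell cell(sex, region);
        cells_[cell].insert(actorId);
        where_[actorId] = cell;
    }

    // Called when the actor no longer meets the membership criteria.
    void remove(long actorId) {
        const auto it = where_.find(actorId);
        if (it == where_.end()) return;
        const auto cellIt = cells_.find(it->second);
        assert(cellIt != cells_.end());  // the index and the cells agree
        cellIt->second.erase(actorId);
        where_.erase(it);
    }

    // Number of actors currently in one subset.
    std::size_t count(Sex sex, int region) const {
        const auto it = cells_.find(Cell(sex, region));
        return it == cells_.end() ? 0 : it->second.size();
    }

private:
    std::map<Cell, std::set<long>> cells_;  // subset contents
    std::map<long, Cell> where_;            // current subset of each actor
};
```

In Modgen this bookkeeping is done automatically; the sketch only shows why an actor can never appear in two subsets at once.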

Links

Links define relationships among actors of the same or different types. There can be links between one actor and a single other actor, one actor and several actors, or several actors and several actors. In all cases, Modgen automatically creates and maintains reciprocal links. This means that a model's code need only assign one of the links and the reciprocal link will be altered to reflect the change.

Examples of links:

// Declare the one-to-many link for family membership
link
	Person.lFamily		//EN Family
	Family.mlMembers[]	//EN Family members 
;

// link between children and parents 
link 
	Person.mlChildren[]	//EN Children
	Person.mlParents[]	//EN Parents 
;

// link between person and spouse
link Person.lSpouse;	//EN Spouse

For more information on links, you may refer to the Modgen 12 Developer's Guide.
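
The reciprocal maintenance of links can be pictured with a small plain C++ sketch of a one-to-one spouse link. This is not Modgen code and does not describe Modgen's internals: PersonNode and setSpouse are hypothetical names, and the rule that previous partners are detached is an assumption made to keep both sides consistent.

```cpp
#include <cassert>

// Hypothetical actor holding a one-to-one link to another actor.
struct PersonNode {
    PersonNode* spouse = nullptr;
};

// Assigns one side of the link; the reciprocal side is updated as well.
// Passing nullptr removes the link on both sides.
inline void setSpouse(PersonNode& person, PersonNode* partner) {
    assert(partner != &person);  // an actor cannot be linked to itself

    // Detach the previous partner of each side, if any.
    if (person.spouse != nullptr) {
        person.spouse->spouse = nullptr;
    }
    if (partner != nullptr && partner->spouse != nullptr) {
        partner->spouse->spouse = nullptr;
    }

    // Assign the link and its reciprocal.
    person.spouse = partner;
    if (partner != nullptr) {
        partner->spouse = &person;
    }
}
```

The model code calls only one assignment; both ends of the link stay consistent afterwards.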

Parameters

Parameters give model users a certain amount of control over simulations. They should be used everywhere the developer wishes to give control of a model to the user. Allowing model users to specify hazards to control various aspects of a simulation, for example, gives them an opportunity to explore different scenarios.

There are two main kinds of parameters:

  • parameters
  • model-generated parameters

Parameters

This is the most frequent kind. Parameters are used when users are to enter values. Parameter values are found in .dat files. Users thus control the values of these parameters directly.

Example:

parameters 
{
	//EN total mortality hazards (m total) 
	double MortHazard[MODELED_AGE][SEX];
};

Model-generated parameters

These are parameters that must be derived, often from other values given in parameters. For each model-generated parameter, there must be corresponding code in the "PreSimulation" function, as illustrated in the following example.

Example:

parameters 
{
	//EN Residual mortality hazards
	model_generated double 
		ResidualMortHazard[MODELED_AGE][SEX];
};

Tables

Modgen models produce output in the form of tables. There are two different table types:

  • tables: A model's cross-tabulated tables generate a simulation's aggregated results.
  • user tables: These tables are misnamed. It is not model users who are referred to but Modgen users - in other words, model developers! User tables allow model developers to create code to calculate the value of each cell. For each user table, code must be placed in a "UserTables" function to control the table's contents. Typically, this code reads and combines values from various tables to create other tables.

Table and parameter groups

Modgen allows some symbols to be grouped together. These groups are mainly used to group symbols in model interfaces but are also described in the model's help generated by Modgen. There are three kinds of symbols that can be grouped:

  • model parameters
  • model-generated parameters
  • tables and user tables

Membership of a symbol in a group must be specified in the group's declaration. Note that a group itself can contain another group of the same kind, thus giving the model developer the ability to create a hierarchy.

Example of a parameter group:

parameter_group DEMOGRAPHY  //EN Demography parameters 
{
	HasardMortalite, HasardNatalite, AgeMaximal, ProbabiliteGarcon, VariationFecondite
};

Example of a table group:

table_group DemographyTables { //EN Demography tables 
    BirthRates,
    Population,
    Population5Year
};

Types

Type symbols are support symbols. They are not essential to simulations since they do not play any specific role in the running of a model. However, they are everywhere since they are used to define other symbols such as states and parameters. They are also used to give labels to tables. There are three kinds of type symbols:

  • Classifications
  • Ranges
  • Partitions

Classifications

A classification is a set of levels or categories with their labels.

Example:

classification LANGUE2 
{
	//EN francophone
	L2_FRANCO,
	//EN non francophone
	L2_NON_FRANCO
};

Ranges

Ranges are sets of consecutive integers, defined by a minimum and a maximum value.

Example:

range AGE_NATALITE {15, 49} ;	//EN Age for birth rates

Partitions

A partition is a set of boundary points that divides the values of a continuous (or discrete) state variable into intervals.

Example:

//EN Age groups 
partition GROUPE_AGE 
{ 
	5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 
	80, 85, 90, 95, 100 
};
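
To show how boundary points turn a continuous value into an interval number, here is a minimal plain C++ sketch. It is not Modgen code: the function names are hypothetical, and it assumes that each interval includes its lower boundary and excludes its upper boundary, with one open-ended interval below the first point and one above the last.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Number of intervals defined by a set of boundary points:
// one more than the number of points.
inline std::size_t intervalCount(const std::vector<double>& bounds) {
    return bounds.size() + 1;
}

// Index of the interval containing 'value', counted from 0.
// Assumption: intervals are closed on the left, so a value equal
// to a boundary point falls in the interval that starts at that point.
inline int partitionIndex(const std::vector<double>& bounds, double value) {
    assert(std::is_sorted(bounds.begin(), bounds.end()));
    // upper_bound finds the first boundary strictly greater than value,
    // so its position equals the number of boundaries <= value.
    return static_cast<int>(
        std::upper_bound(bounds.begin(), bounds.end(), value) - bounds.begin());
}
```

With the twenty boundary points of GROUPE_AGE above, this gives 21 intervals: an age of 3 falls in interval 0, ages 5 to just under 10 fall in interval 1, and any age of 100 or more falls in interval 20.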

The descriptions presented here are taken from the Modgen 12 Developer's Guide. You can refer to it for a more complete description.

Course of a model simulation

In addition to the symbols contained in them, models are also made up of sets of general functions that control their execution. The process described here is that of a case-based model. However, most of what is described here also applies to time-based models. Note that this section does not deal with the issue of parallel processing that is incorporated into Modgen.

Parameter reading

The first step of a case-based or time-based model simulation is to read the parameters. It is not necessary to create a global function in the model code to manage parameter reading or interactive modification; Modgen provides this.

Parameter validation

Modgen includes functions that allow developers to validate and change the values of the parameters provided by users. For more information on those functions, you can refer to the Modgen 12 Developer's Guide.

Presimulation

After the parameters are read, the presimulation phase begins. This phase will not do anything at all unless the model developer has defined one or more "PreSimulation" functions. Note that it is possible to create more than one "PreSimulation" function to increase modularity. In a future version of Modgen, it will even be possible to specify the execution order of a model's "PreSimulation" functions.

Normally, "PreSimulation" serves to calculate the model-generated parameters. At this stage, there is still just one simulation thread so the model-generated parameters can be changed without there being any risk of conflict between the different simulation threads. Moreover, a "PreSimulation" function cannot change actors since they have not yet been created.

Example:

parameters 
{
	//EN total mortality hazards (m tot) 
	double MortHazard[MODELED_AGE][SEX];

	//EN cause-specific population death hazards for disease X (m(x))
	double MortHazardX[MODELED_AGE][SEX];

	//EN cause-specific population death hazards for disease C (m(c))
	double MortHazardC[MODELED_AGE][SEX];

	//EN Residual mortality hazards
	model_generated double 
		ResidualMortHazard[MODELED_AGE][SEX];
};

void PreSimulation()
{
	int nAge = {0};
	int nSex = {0};
	int nTime = {0};

	for (nAge = 0; nAge < SIZE(MODELED_AGE); nAge++)
	{
		for (nSex = 0; nSex < SIZE(SEX); nSex++)
		{
			// Calculate residual mortality hazard
			ResidualMortHazard[nAge][nSex] = 
				MortHazard[nAge][nSex] 
					- MortHazardC[nAge][nSex] 
					- MortHazardX[nAge][nSex];
		}
	}
}

Simulation

Once the presimulation has ended, the simulation phase begins. The model must contain a "Simulation" function. Typically, for case-based models this will simply be a loop calling the "CaseSimulation" function for each model case.

Example:

void Simulation()
{
	long  lCase = 0;
	for ( lCase = 0; lCase < CASES() && !gbInterrupted && !gbCancelled && !gbErrors; lCase++ )
	{
		StartCase(); 
		CaseSimulation(); 
		SignalCase();
	}
}

The "Simulation" function is more complicated for time-based models since it also has to create an initial population. It must also contain the events loop found in the "CaseSimulation" function described below for case-based models.

Note that this function is included in models created using a Modgen model creation wizard.

CaseSimulation

In case-based models, Modgen does not require a function called "CaseSimulation". However, using such a function simplifies the code. This function is responsible for the simulation of one case and starts by creating the case's starting actor. It is inside this function that the events loop controlling the simulation of the case should be found.

Example of an events loop:

// event loop for current case 
while ( !gpoEventQueue->Empty() ) 
{
		
	if ( gbCancelled || gbErrors)
	{
		// in case of errors, close case and destroy all actors 
		gpoEventQueue->FinishAllActors();
	}
	else 
	{
		// advance time to next event 
		gpoEventQueue->WaitUntil( gpoEventQueue->NextEvent() );
	
		// execute next event 
		gpoEventQueue->Implement();
	}
}

This loop is at the heart of model simulation and should only be changed with great care. Note also that it is included in models created by one of Modgen's model creation wizards.

In fact, this loop controls the Modgen simulation engine. Within the simulation engine, there is an events queue that maintains the wait time for each model event. This queue is ordered in such a way that events are executed in a specific order. The simulation engine within Modgen executes the events in the following order:

  • The event with the smallest wait time is executed first
  • If times are the same, the event with the highest priority is executed first
  • If their priorities are the same, events are executed in alphabetical order
  • If their names are also the same, the event of the actor first created in the simulation is executed first. Note that events with the same name have to be associated with different actors, since a given event can only be found once for each separate actor in the events queue.
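
The four ordering rules above amount to a lexicographic comparison. The following plain C++ sketch expresses them as a comparator; it is not the Modgen simulation engine. PendingEvent, runsBefore and executionOrder are hypothetical names, a smaller actor identifier is assumed to mean an earlier creation, and alphabetical order is approximated by the case-sensitive ordering of std::string.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical entry of the events queue.
struct PendingEvent {
    double time;       // time at which the event is due
    int priority;      // a higher value runs first when times are equal
    std::string name;  // event name
    long actorId;      // assumed: smaller id = actor created earlier
};

// Returns true if 'a' must be executed before 'b'.
inline bool runsBefore(const PendingEvent& a, const PendingEvent& b) {
    if (a.time != b.time) return a.time < b.time;                  // rule 1
    if (a.priority != b.priority) return a.priority > b.priority;  // rule 2
    if (a.name != b.name) return a.name < b.name;                  // rule 3
    return a.actorId < b.actorId;                                  // rule 4
}

// Returns the events in the order in which they would be executed.
inline std::vector<PendingEvent> executionOrder(std::vector<PendingEvent> events) {
    std::sort(events.begin(), events.end(), runsBefore);
    assert(std::is_sorted(events.begin(), events.end(), runsBefore));
    return events;
}
```

Sorting a whole list is only for demonstration; a real engine would keep the queue ordered as event times are recalculated.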

PostSimulation

If needed, model developers may also define a "PostSimulation" function. If it is defined, this function is run after the simulation phase but before data is tabulated. This function is only rarely used. However, if global variables are used in a model, it might be useful to reinitialize them in this function. Then, if the user restarts a simulation without shutting the model down, the global variables are initialized to the same values as in the first simulation.

Tabulation

The last phase produces the tables. This description is somewhat misleading, however, since tabulation takes place in Modgen as the model runs. In this phase, the tables produced during the simulation run are simply saved in the output database.

However, user tables are calculated at this stage before being saved in the database. It is at this stage that the "UserTables" functions, created by the model developer to calculate the values in the user tables, are executed.
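
The sequence of phases described in this section can be summarized in a short plain C++ sketch. It is not Modgen's run-time code: ModelSketch and runModel are hypothetical names, and running the "PreSimulation" and "UserTables" functions in the order they were registered is an assumption of the sketch.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of the developer-supplied functions of a model.
struct ModelSketch {
    std::vector<std::function<void()>> preSimulation;  // zero or more
    std::function<void()> simulation;                  // required
    std::function<void()> postSimulation;              // optional
    std::vector<std::function<void()>> userTables;     // zero or more
};

// Runs the phases in the order described in the text and returns
// the list of phases that were executed.
inline std::vector<std::string> runModel(const ModelSketch& model) {
    std::vector<std::string> phases;
    phases.push_back("read parameters");
    phases.push_back("validate parameters");
    for (const auto& fn : model.preSimulation) {
        fn();
        phases.push_back("PreSimulation");
    }
    assert(static_cast<bool>(model.simulation));  // a model must define it
    model.simulation();
    phases.push_back("Simulation");
    if (model.postSimulation) {
        model.postSimulation();
        phases.push_back("PostSimulation");
    }
    for (const auto& fn : model.userTables) {
        fn();  // user tables are calculated before tables are saved
        phases.push_back("UserTables");
    }
    phases.push_back("save tables");
    return phases;
}
```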

Appendix: Creating a new Modgen database file

Overview

If you want to make a biography using different actors and/or states, you will need to create a new database using Modgen.  The creation of the database file occurs with the execution of a Modgen simulation model and, therefore, requires some knowledge of the Modgen simulation environment.  The name, location, and contents of this special file are controlled as follows:

  • the filename, its location within the directory structure, and the number of simulated lifetimes contained within the database file are defined through a Modgen scenario and its Scenario Settings dialog box.  Modgen manages model run outputs by using the scenario name followed by a bracketed identifier indicating the type of Modgen output and an extension indicating the type of file. For tracking outputs, the convention used is: scenario_name(trk).mdb.

  • the set of actors and states to be displayed by BioBrowser must first be listed in the model's variable tracking facility within the Modgen source code for the specific simulation model. This set, including the tracking filter for each actor, is required by the model at compile time and cannot be set dynamically at run time.
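
The naming convention for tracking outputs can be written as a one-line helper. This is a plain C++ sketch for illustration; trackingFileName is a hypothetical function, not part of Modgen or BioBrowser.

```cpp
#include <cassert>
#include <string>

// Builds the tracking database file name for a scenario, following the
// convention scenario_name(trk).mdb described above.
inline std::string trackingFileName(const std::string& scenarioName) {
    assert(!scenarioName.empty());  // a scenario must have a name
    return scenarioName + "(trk).mdb";
}
```

For a scenario named demo, the helper returns demo(trk).mdb, the name of the demonstration file supplied with BioBrowser.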

Scenario settings

There are a variety of Modgen scenario settings which determine the nature of each specific simulation.  The scenario settings are discussed in complete detail in the Modgen User’s Guide.  However, there are three main settings which must be understood in order to create an appropriate database file for the Biographical Browser.

  • In the Scenario/Settings/General tab, select "MS Access tracking". This setting tells Modgen that, in addition to the other outputs specified, a tracking database file is requested as output for this model run.
  • For case-based models, select the number of cases in the Scenario/Settings/General tab. As a general rule, it is important to specify a small number here since all of the simulated cases which meet the tracking filter will be included in the database file. The size of this file can become very large very quickly, depending on the number of cases and states included. In addition, variable tracking will slow the simulation down considerably.
  • For time-based models, select the time-units in the Scenario/Settings/General tab.

The demonstration file accompanying this release, demo(trk).mdb, contains 20 cases (which for this model corresponds to 20 simulated lifetimes of a person actor whose dominant state is True). It was created from the Statistics Canada model, LifePaths.

Modgen model state tracking

The tracking facility provided by the Modgen language controls the type of actors to be included in the database file along with the list of states which are to be analyzed with BioBrowser.

The track command must be included in one of the .mpp files that contain the Modgen code defining the simulation model.  If the analyst wishes to change the type of actors to be tracked, or change the complement of states to be output, then the model must be recompiled.  This is discussed further in the Modgen Developer’s Guide, which should also be consulted by readers who are unfamiliar with the concepts outlined in the syntax and examples which follow.

Syntax of the Track Command

                track actor_name [filter] { state_or_link , ... , state_or_link } ;

It is important to remember that only one track definition is allowed for each actor in the model.  In addition, the filter specifies if and when an actor’s states or links are to be output to the database file.

Example

In this example, the states which describe the person are only output to the database file when he/she is married or remarried (the dominant characteristic is discussed in the Modgen Developer’s Guide).  Therefore, if the person was married at age 25, divorced at age 40, and remarried at age 50, then the database file would only contain information on this person from age 25 to age 40, and from age 50 to the age at death.  Nine states describe the characteristics of the individuals stored in the database file.  The last two items of the track definition are not states but links to other individuals associated with this person.  In this case, the database will also contain information on the nine states defined in the track command for the person’s spouse and children, if they are present.

track Person
[ dominant || mar_status == MARRIED || mar_status == REMARRIED ]
{
      es_state,
      ed_level_ep,
      sex,
      dominant,
      employed,
      mar_status,
      marstat_legal,
      fte_earnings,
      children_at_home,
      lSpouse,
      mlChildren
};
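
The effect of a filter on what is recorded can be worked through in plain C++. The sketch below follows only the marital status condition described in the text and leaves the dominant state out; it is not Modgen code. The type and function names are hypothetical, and the age at death used in the usage note is an assumed value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class MarStatus { Single, Married, Divorced, Remarried };

// A change of marital status at a given age.
struct StatusChange {
    double age;
    MarStatus status;
};

// A span of ages during which the actor is tracked.
struct Interval {
    double from;
    double to;
};

// The marital status part of the filter: married or remarried.
inline bool passesFilter(MarStatus status) {
    return status == MarStatus::Married || status == MarStatus::Remarried;
}

// Returns the age intervals during which the filter is satisfied.
// 'changes' must be ordered by age; the last status lasts until death.
inline std::vector<Interval> trackedIntervals(
        const std::vector<StatusChange>& changes, double ageAtDeath) {
    std::vector<Interval> tracked;
    for (std::size_t i = 0; i < changes.size(); ++i) {
        const double end =
            (i + 1 < changes.size()) ? changes[i + 1].age : ageAtDeath;
        assert(changes[i].age <= end);  // input must be ordered by age
        if (!passesFilter(changes[i].status)) continue;
        if (!tracked.empty() && tracked.back().to == changes[i].age) {
            tracked.back().to = end;  // extend a contiguous interval
        } else {
            tracked.push_back(Interval{changes[i].age, end});
        }
    }
    return tracked;
}
```

For a person who marries at 25, divorces at 40, remarries at 50 and, by assumption, dies at 82, the function returns two intervals: 25 to 40 and 50 to 82.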

Example

In this example, the date of birth and sex states for a child actor are output to the database file once the child is established in a family (the tentative characteristic is discussed in the Modgen Developer’s Guide).

track Child [ !tentative ]
{
      date_of_birth,
      sex,
      mlParents
};

Contents of the database file

The MS Access file produced by a Modgen simulation contains a data table called History and the following dictionaries organized hierarchically:

ActorDic - actor dictionary.
ActorStateDic - tracked state dictionary.
ActorLinkDic - tracked linked actor dictionary.
TypeDic - state type dictionary containing a pointer to a specific dictionary below.

The dictionaries for each state type are as follows:

SimpleTypeDic - for simple state types, e.g. integer, double, float.
LogicalDic - for logical states.
ClassificationDic and ClassificationValueDic - for classification states and their values.
RangeDic - for range type states.
TypeDic - for linked actors.

Any dictionary with textual identifiers will contain 1 record for each language implemented in the specific Modgen model, as indicated by LanguageDic.

In addition, a file version table called VersionInfo will indicate the version of the tracking file. The BioBrowser uses this version to maintain backward compatibility if the tracking database design is changed.

The History table contains one record for each tracked state at one point in time. A record is added to this table each time a tracked state changes for a tracked actor. The History table fields used by BioBrowser are: an object identifier, time, a state identifier, and a value for the given state.  The ShowData window contains a subset of records from this table. Since the object identifier is common across all records in the ShowData window, its value is shown in the window caption.
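
Because the History table stores only changes, the value of a state at a given time is the value of the most recent record at or before that time. The plain C++ sketch below illustrates this reading of the table. It does not use the actual column names or types of the database, which are not given here: HistoryRecord, its fields and valueAt are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical in-memory form of one History record.
struct HistoryRecord {
    long objectId;  // identifier of the tracked actor
    double time;    // time at which the state changed
    int stateId;    // identifier of the tracked state
    double value;   // value of the state from that time on
};

// Looks up the value of a state for one actor at time 'when'.
// Returns false if no record exists at or before 'when';
// otherwise stores the value in 'value' and returns true.
inline bool valueAt(const std::vector<HistoryRecord>& history,
                    long objectId, int stateId, double when, double& value) {
    assert(when == when);  // 'when' must not be NaN
    bool found = false;
    double latest = 0.0;
    for (const HistoryRecord& record : history) {
        if (record.objectId != objectId || record.stateId != stateId) continue;
        if (record.time > when) continue;  // change has not happened yet
        if (!found || record.time >= latest) {
            latest = record.time;
            value = record.value;
            found = true;
        }
    }
    return found;
}
```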

Customized products and services

Statistics Canada offers a variety of customized products and services to serve clients in Canada and around the world.

Professional services

Our professionals will assist you in identifying your information needs and product portfolio and recommend the most appropriate Statistics Canada products and services in line with your budget. This service includes custom data, maps, research and analysis.

Consulting services

Statistics Canada's Statistical Consultation Group offers consultation, project management services, and statistical training services to government departments and agencies, public and private sector institutions, in Canada and abroad.

Custom surveys

Learn more about Statistics Canada's custom survey services and discover how we can help you with your survey.

Web data services

Statistics Canada has developed a Web Data Service that provides access to the data and metadata that we release each business day. This is a good option for developers who want to consume a discrete number of data point updates to Statistics Canada data.

Contact us

Statistics Canada is committed to serving our clients in a prompt, reliable, courteous and fair manner. To this end, the agency has developed standards of service to the public that employees observe in serving our clients.

More about variables: How to interpret the information about variables

"A variable is a characteristic of a statistical unit being observed that may assume more than one of a set of values to which a numerical measure or a category from a classification can be assigned."

In the above definition, the key components are:

  • statistical unit being observed,
  • characteristic,
  • numerical measure, and
  • category from a classification.

These components constitute the standard components used in this information package to name and structure variables. A statistical organization publishing data has to adopt a standard way to name and structure the variables to which the data relate. From the point of view of the users, they must be able to recognize the same structure underlying the name of variables whichever sub-division of the organization is producing the data and whatever the subject area being studied. From the point of view of the management of the information about the data (referred to as metadata) published by the organization, it is necessary to adopt a standard naming convention and structure for the variables in order to efficiently store the metadata in a central database, allow efficient extraction, and permit efficient search by users.

The naming convention and structure referred to above are adapted from the International Organization for Standardization (ISO) standard Information technology - Metadata registries (MDR), or ISO 11179. This standard is being adopted by an increasing number of National Statistical Organizations.

How the structure is applied

When it is decided that a statistical program will produce data to illuminate a certain subject area, the responsible analysts have to determine:

  • which statistical unit(s) will be observed, e.g. persons or households, etc. in the case of a social statistics program, or business establishments or enterprises in the case of a business statistics program; then,
  • which characteristics of these statistical units will be measured, e.g. revenue or expenses, etc., and sometimes the actual occurrence of the statistical unit (e.g. a count of persons, in which case the characteristic measured is the statistical unit's state of existing); then
  • most often, the statistical program will produce data for more than just the total of the units being observed and for the global characteristic being considered; the program will probably produce data for sub-categories of the statistical unit, and for sub-categories of the characteristic considered. For example, in the case of income of households, data is produced for different categories of revenues, e.g. wages, pensions, etc.; as well, data is produced for different categories of households, e.g. households with one income earner, with two income earners, etc. These categories are what statistical organizations call 'classes within classifications'. For coherence of the data published by the various sub-divisions of a statistical organization, and even by different statistical organizations, standard classifications are created. These usually comprise the most frequently used categorizations of characteristics and observation units. For example, the three North American countries have developed the North American Industry Classification System (NAICS) in order to publish data for the same sub-categories of industry of statistical units. Finally, the analysts have to decide
  • which unit of measure will be used to express the numerical values, e.g. in the case of income, it could be current Canadian dollars, constant 1997 dollars, etc.
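
The four decisions above can be sketched as a small data structure. This is only an illustration, not the schema used by any statistical organization or by ISO/IEC 11179: the `Variable` class, its field names, and the naming pattern in `name()` are all hypothetical, and the example classes are taken from the household income example in the text.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Variable:
    """Hypothetical container for the components of a variable."""
    statistical_unit: str   # e.g. households
    characteristic: str     # e.g. total income
    unit_of_measure: str    # e.g. current Canadian dollars
    # Classification name -> list of classes within that classification.
    classifications: dict[str, list[str]] = field(default_factory=dict)

    def name(self) -> str:
        # Illustrative naming pattern only; an assumption, not a standard.
        return f"{self.characteristic} of {self.statistical_unit}"

    def breakdowns(self) -> list[tuple[str, ...]]:
        # Every combination of one class from each classification.
        return list(product(*self.classifications.values()))


income = Variable(
    statistical_unit="households",
    characteristic="Total income",
    unit_of_measure="current Canadian dollars",
    classifications={
        "income source": ["wages", "pensions"],
        "household type": ["one income earner", "two income earners"],
    },
)

print(income.name())
for combo in income.breakdowns():
    print(combo)
```

Each tuple returned by `breakdowns()` corresponds to one combination of classes for which the program could publish a value, expressed in the variable's unit of measure.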

How to read time series statistical tables using the ISO components

Imagine a time series table, applying to Canada, where the column headings consist of the reference periods and the row headings contain the names of the general characteristics being measured for the statistical unit being observed, e.g. "Total income of all Households". The variable documentation you are consulting defines the characteristic being measured and the statistical unit being observed. The cells along the rows contain the numerical values, expressed in the unit of measure indicated in the variable documentation.

In most cases, the data in the table will be broken down by geographic areas within Canada, e.g. provinces and territories, or Canadian Metropolitan Areas. The variable documentation informs users of this geographic breakdown. In most cases, the value of the general characteristic being measured will also be broken down by sub-categories of the characteristic and/or of the statistical unit, in other words by classes within classifications, e.g. classes of income sources or classes of industries. The variable documentation always informs users of the different classes of the specific classification(s) used to detail the data in the table. The names of these classes and groups of classes appear as row headings in the table.
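
As a rough sketch of how such a table can be read, the layout described above can be modelled with reference periods as columns and (geographic area, class) pairs as row headings. Everything here is hypothetical: the `periods`, `table`, and `read_cell` names are assumptions made for illustration, and all numbers are invented placeholders rather than published statistics.

```python
# Column headings: reference periods (hypothetical years).
periods = [2001, 2002, 2003]

# Row headings: (geographic area, class of income source).
# All values are invented placeholders, not published statistics.
table = {
    ("Canada", "Total income"): [300.0, 310.0, 320.0],
    ("Canada", "Wages"): [200.0, 205.0, 210.0],
    ("Canada", "Pensions"): [100.0, 105.0, 110.0],
    ("Ontario", "Total income"): [120.0, 125.0, 130.0],
}


def read_cell(geography, income_class, period):
    """Return the value at the given row heading and reference period."""
    row = table[(geography, income_class)]
    return row[periods.index(period)]


print(read_cell("Canada", "Wages", 2002))
```

The unit of measure is deliberately absent from the table itself; as the text notes, it comes from the variable documentation.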
