Statistics Canada - Government of Canada

Archived Content

Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject to the Government of Canada Web Standards and has not been altered or updated since it was archived. Please "contact us" to request a format other than those available.


Statistical relationships between crime trends and major socio-economic trends (note 1)

The following section examines the statistical relationships between changes in crime rates and a number of macro-level socio-demographic and economic trends. The focus is on four major crime types over the period 1962 to 2003: homicide, robbery, break and enter and motor vehicle theft. These offences were chosen for time series analysis because they have been consistently reported to the Uniform Crime Reporting Survey (UCR) over time and are less likely than other types of offences to be subject to changes in legislation and police charging practices (as is the case with offences such as sexual assault, assault, drug offences and prostitution) or the reporting behaviour of victims. Furthermore, we postulate, as others have (Cantor & Land, 1985), that different crimes are influenced by different factors. In order to accommodate differences in the relationship between the independent variables and crime, specific crime types are examined as opposed to overall rates of violent crime or property crime.

The statistical method chosen for this study was time series analysis. This method has been employed by a number of other researchers to investigate the link between specific crime types and economic and demographic change (LaFree, 1999; LaFree et al., 1992; Cohen & Land, 1987). Time series analysis is used in this report because variables in the analysis are time-ordered (for example, trends in crime rates and unemployment rates). This method has the advantage of being able to take account of what happened in the preceding year or two as well as to make current year comparisons among variables. This allows the flexibility to address important questions, such as whether unemployment rates are correlated with crime rates in the current year or after a lag of a year or two.

A primary limitation of time series analysis is that it can include only those factors that have been measured and recorded annually over many time points. This necessarily eliminates the primary source providing statistical data on characteristics of the population, the Census, which is conducted every five years. There are some important exceptions, such as age and sex of the population, which are available annually through intercensal estimates. Table 5 lists the wide range of socio-demographic and economic indicators considered for this analysis, as well as the source and the time period for which they are available.

It is also important for time series analysis that all variables are available over an identical and lengthy time period. Unfortunately, potentially important measures such as lone parent families and a number of economic indicators, including low income and the percentage of families receiving employment insurance and social assistance, are available only from 1980 onward. As a longer time series was needed to yield reliable estimates, analysis was restricted to those variables available for the years relatively consistent with UCR crime rates: 1962 to 2003. These variables are marked in Table 5 with an asterisk. This longer time series provides the maximum number of degrees of freedom and hence more robust time series models.

Description of variables
Time series methods
Multivariate results
Discussion

Description of variables

Crime rates

The primary source of information on crime trends in Canada is Statistics Canada's Uniform Crime Reporting (UCR) Survey. Since 1962, all police departments across the country have supplied the following summary statistics to the UCR Survey:

  1. criminal offences known to the police;
  2. unfounded offences (deemed not to be a crime following police investigation);
  3. actual criminal offences (those deemed founded);
  4. the number cleared by charge and cleared otherwise; and
  5. the number of adults and youth charged.

Socio-economic variables

The predictor variables in this analysis were selected on the basis of their relevance to the criminological explanations for crime summarized earlier in this report, as well as their availability in a time series to 1962. These include the age structure of the population, unemployment, inflation and per capita alcohol consumption.

Age structure of the population

The age composition of the population is one of the most prominent explanations for changes in crime rates. To test the relationship between crime patterns and age, the percentages of the population 15 to 24 and 25 to 34 years of age are included in this analysis. Data for age are derived from the average of quarterly population estimates and the estimates of population in certain age groups used by the Labour Force Survey (the LFS uses the estimates obtained from the Census). These estimates are adjusted for any undercoverage and population growth.

Unemployment

Unemployment rates are derived from the Labour Force Survey. This survey covers approximately 98% of the population and excludes residents of the Yukon, Northwest Territories and Nunavut, persons living on Indian Reserves, full-time members of the Canadian Armed Forces and inmates of institutions. Unemployed persons are defined as those persons who were available for work and were either on temporary layoff, had looked for work in the past 4 weeks or had a job to start within the next 4 weeks. The unemployment rate excludes discouraged workers who are available to work but are no longer actively seeking employment.

Inflation

Inflation is derived from the Consumer Price Index (CPI) and is simply the year-over-year difference in the CPI expressed as a percentage of the previous year. Inflation occurs when there is an upward movement in the average level of prices.
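
The definition above amounts to a one-line calculation. The following is a minimal sketch in Python; the function name and the CPI values are hypothetical and serve only to illustrate the arithmetic.

```python
def inflation_rate(cpi_previous: float, cpi_current: float) -> float:
    """Year-over-year change in the CPI as a percentage of the previous year's CPI."""
    return (cpi_current - cpi_previous) * 100.0 / cpi_previous


# Hypothetical CPI values, for illustration only.
print(inflation_rate(100.0, 102.0))  # 2.0
```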

Per capita levels of alcohol consumption

The per capita level of alcohol consumption is based on the disappearance of alcohol in Canada (expressed in litres) divided by the total population. Alcohol disappearance is derived from the Control and Sale of Alcoholic Beverages in Canada (Public Institutions Division). In the absence of long-term data identifying drinking patterns among Canadian adults, alcohol disappearance is used as a proxy for alcohol consumption, consumption being defined as disappearance minus wastage. In the case of alcohol, the average annual wastage is quite low (3.5%), especially when compared to other food categories (e.g. the average annual wastage for fruits and vegetables is approximately 40%).

Time series methods

In this analysis we are exploring the extent to which changes over time in the dependent variable, the crime rate, can be explained by changes in independent variables, a selection of socio-economic indicators. These socio-economic indicators may, however, move in a similar way to the crime rate over time, but have no causal relationship with the crime rate. As a result, modeling with the variables as they are, using either simple correlation analysis or multiple regression techniques, could lead to a false conclusion that a causal relationship exists, when, in reality, there is none.

In technical terms, this problem exists because the crime rate and the other socio-economic indicators to be included in the model are not stationary in their mean (average) or variance over time. What this means is that if, over the whole time series from 1962 to 2002, one took repeated samples of shorter time series for each variable, the variable's mean and variance would differ across the samples.
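
The risk of a false correlation between trending series can be illustrated with a small sketch in Python. The two series below are made up: each is a steady upward trend plus its own unrelated wiggle, so by construction neither influences the other. Their levels nonetheless correlate almost perfectly, while their year-to-year changes do not correlate at all. The function names are illustrative assumptions, not part of the original analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def first_difference(series):
    """Year-to-year changes: series[t] - series[t-1]."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


# Two made-up series of 41 annual points. Each is a steady upward trend plus
# its own small, unrelated wiggle (period 2 for x, period 4 for y).
wiggle_x = [0.5, -0.5]
wiggle_y = [0.5, 0.5, -0.5, -0.5]
x = [t + wiggle_x[t % 2] for t in range(41)]
y = [2 * t + wiggle_y[t % 4] for t in range(41)]

levels_r = pearson(x, y)
changes_r = pearson(first_difference(x), first_difference(y))
print(f"correlation of levels:  {levels_r:.3f}")
print(f"correlation of changes: {changes_r:.3f}")
```

Differencing removes the shared trend, which is why the transformed series are a safer basis for modeling than the raw levels.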

Terminology

What is a logarithm?

Taking the logarithm of a variable is a common technique used on variables with a large range and a high variability among values. The log re-scales the values of the variable to help improve the statistical properties of the variable and therefore the properties of the time series results.

What is a correlation?

A correlation measures the linear relationship between two variables measured over a series of paired observations (in this case, years). Values of a correlation range from -1 to +1. A value of +1 indicates a perfect positive relationship (e.g. the variables move in the same direction) whereas a value of -1 indicates a perfect negative relationship (e.g. the variables move in opposite directions). A value of 0 indicates no linear relationship.

What is time series analysis?

A key statistic in time series analysis is the autocorrelation coefficient, which is the correlation of the time series with itself, lagged by 1 or more periods. The autocorrelation coefficient indicates how values of the variable in question relate to each other at zero lag, lag 1, lag 2, etc. Autocorrelation within the data means that some of the variance in the current value is explained by the history of the variable. For example, unemployment in 2004 is partially explained by unemployment in 2003, all things being equal. For this analysis, lag 1 refers to the past year, lag 6 refers to 6 years in the past, etc.
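
As a minimal sketch, the sample autocorrelation coefficient at a given lag can be computed as follows in Python. The function name and the input values are hypothetical; the formula is the usual sample estimator, which may differ in detail from the one used by the software behind this report.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself `lag` periods earlier."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return numer / denom


# Hypothetical annual values, for illustration only.
values = [1.0, 2.0, 3.0, 4.0, 5.0]
print(autocorrelation(values, 0))  # 1.0
print(autocorrelation(values, 1))  # 0.4
```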

What is an ARIMA model?

ARIMA models are Autoregressive Integrated Moving Average models, a general class of models widely used in time series analysis. The technique is premised on investigation of the prior behaviour of a series and is also used to adjust for seasonality. ARIMA models are particularly beneficial if one is interested in forecasting future values to calculate new values of the series and confidence intervals for those predicted values. The estimation and forecasting process is performed on transformed (differenced) data and then the series needs to be integrated (integration is the inverse of differencing) so that the forecasts are expressed in values compatible with the input data. The integration feature gives the order of differencing needed to achieve stationarity.
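
The statement that integration is the inverse of differencing can be shown with a short sketch in Python. The function names and the input levels are hypothetical.

```python
from itertools import accumulate


def difference(series):
    """First difference: the change from each period to the next."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def integrate(changes, start):
    """Inverse of differencing: rebuild levels from a starting level and the changes."""
    return list(accumulate(changes, initial=start))


# Hypothetical levels, for illustration only.
levels = [10, 12, 11, 15]
changes = difference(levels)            # [2, -1, 4]
print(integrate(changes, levels[0]))    # [10, 12, 11, 15]
```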

The first step in time-series modeling is to transform the variables to be included in the model in a way that reduces the risk of spurious or false correlations by creating a stationary mean and variance. Taking the logarithm of a variable is a common technique to transform variables to achieve this goal, particularly when the variables have a large range and a high variability among the values.

For these time-series models, the log of each variable to be included in the models was calculated and then the growth rate in the log values was calculated, resulting in a transformed data series for each variable. The only exception to this was for the rate of inflation, because the variable itself is a growth rate. For this variable, the log value was sufficient.
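
A minimal sketch of this transformation in Python follows. It assumes that the "growth rate in the log values" is the change in the natural logarithm from one year to the next, as the Δlog notation in the notes suggests; the function name and the input rates are hypothetical.

```python
import math


def log_growth(series):
    """Change in the natural log between consecutive years: log(x[t]) - log(x[t-1])."""
    logs = [math.log(v) for v in series]
    return [curr - prev for prev, curr in zip(logs, logs[1:])]


# Hypothetical rates per 100,000 population, for illustration only.
rates = [100.0, 110.0, 99.0]
for change in log_growth(rates):
    print(f"{change:+.4f}")
```

For small changes, the difference in logs is close to the proportional growth rate, which is why coefficients on such variables can be read as percentage responses.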

Using the transformed variables, bivariate or one-on-one models were constructed to determine which independent variables had a statistically significant relationship with the crime rate. Multivariate models were then constructed by testing different combinations of independent variables that had been significant in the bivariate models.

In any modeling exercise it is usually not possible to include within the model all of the variables that would be important to explaining why changes in the dependent variable, in this case the crime rate, have occurred. Error in the models that results from missing important variables is referred to as the "residual". While it is rare for models to eliminate error, it is important that this error or "residual" be random, or white noise (note 2), in order to accurately interpret how significant the variables included in the model are in explaining changes in the crime rate over time, that is, to avoid false or spurious results.

In time series models, autocorrelation coefficients are key statistics that measure whether the dependent variable, the crime rate, is correlated with itself: with last year's crime rate (lag of 1 period), the crime rate six years ago (lag 6), twelve years ago (lag 12), etc. Autocorrelation within the data means that some of the variance in the dependent variable, the crime rate, is explained by the history of the crime rate itself. The presence of autocorrelation in the models results in residuals that are not random but that have a pattern to them. Lag variables, that is, the crime rate 6 years ago, 12 years ago, 18 years ago and 24 years ago, are included in the time series models in order to test for the presence of autocorrelation. Models where these lag variables are statistically insignificant pass the "white noise test", that is, the residuals in the models are random or white noise.
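
The idea of checking residuals for leftover patterns can be sketched in Python. Note that this is not the procedure described above, which tests lagged crime-rate terms for statistical significance within the model. The sketch substitutes a common rule of thumb: for white noise, sample autocorrelations should mostly fall within about ±1.96/√n. The function names and the residuals are hypothetical.

```python
import math


def autocorrelation(series, lag):
    """Sample autocorrelation of a series with itself `lag` periods earlier."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    numer = sum((series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n))
    return numer / denom


def flagged_lags(residuals, lags):
    """Lags whose autocorrelation lies outside the approximate 95% white-noise band."""
    bound = 1.96 / math.sqrt(len(residuals))
    return [k for k in lags if abs(autocorrelation(residuals, k)) > bound]


# Made-up residuals with an obvious two-year pattern, for illustration only.
residuals = [1.0, 1.0, -1.0, -1.0] * 10
print(flagged_lags(residuals, [1, 2]))  # [2]
```

An empty list would indicate no evidence of pattern at the lags examined; a flagged lag suggests that the residuals still carry structure the model has not captured.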

Further, as the fit of the models was also to be tested by examining the models' ability to predict observed crime rates in 2002 and 2003, the residuals themselves could contain information that tells us something about the movement in other important variables that are missing from the models. This information can help us to develop better models to predict future crime. Moving average (MA) terms were added to the models to capture any information in the residuals over time that could improve the models' predictive ability. As a result, the time series models developed for this analysis are ARIMA models (Autoregressive Integrated Moving Average models).
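
To illustrate how a moving average term enters a forecast, the following Python sketch uses the coefficients of the robbery model given in note 4. It makes several assumptions that the report does not confirm: that logarithms are natural logarithms, that logInf(t) is the log of the inflation rate expressed in percent, and that the unknown current shock Z(t) is replaced by its expected value of zero. The function name and all input values are hypothetical.

```python
import math

# Coefficients of the robbery model in note 4:
#   ΔlogRob(t) = 0.026 logInf(t) + Z(t) + 0.37 Z(t-1)
INFLATION_COEF = 0.026
MA1_COEF = 0.37


def forecast_robbery_rate(prev_rate, inflation_pct, prev_residual):
    """One-step-ahead forecast of the robbery rate.

    The future shock Z(t) is unknown and set to its expected value of zero;
    natural logarithms are assumed.
    """
    delta_log = INFLATION_COEF * math.log(inflation_pct) + MA1_COEF * prev_residual
    # Integrate: turn the forecast change in logs back into a rate.
    return prev_rate * math.exp(delta_log)


# Hypothetical inputs, for illustration only.
forecast = forecast_robbery_rate(prev_rate=85.0, inflation_pct=2.0, prev_residual=0.01)
print(round(forecast, 1))
```

The MA term carries last year's unexplained movement, Z(t-1), into this year's forecast, which is how information left in the residuals can improve predictive ability.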

The results of all models were then compared with three criteria used to determine the models that "best fit" each crime type studied.

  1. Socio-economic variables included had statistically significant parameters
  2. Residuals were rendered random (white noise test was passed)
  3. Highest accuracy of the forecasts resulting from the models
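
One way to read these three criteria is as two filters followed by a ranking. The Python sketch below follows that reading, which is an assumption rather than a description of the procedure actually used; the class, the function and all candidate values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    all_parameters_significant: bool   # criterion 1
    passes_white_noise_test: bool      # criterion 2
    forecast_error_pct: float          # criterion 3 (smaller is more accurate)


def best_fit(candidates):
    """Keep models meeting criteria 1 and 2, then pick the most accurate forecast."""
    eligible = [c for c in candidates
                if c.all_parameters_significant and c.passes_white_noise_test]
    if not eligible:
        return None
    return min(eligible, key=lambda c: abs(c.forecast_error_pct))


# Entirely hypothetical candidates, for illustration only.
candidates = [
    Candidate("model A", True, True, 9.0),
    Candidate("model B", True, False, 2.0),
    Candidate("model C", True, True, 6.0),
]
print(best_fit(candidates).name)  # model C
```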

Table 6 presents the "best fit" models for each crime type.

Multivariate results

The following section presents the results of the time series models for each of the four crime types examined.

Homicide rates (note 3)

As shown in Table 6, results of the time series analysis indicate that over the past four decades, shifts in unemployment rates and alcohol consumption are associated with changes in homicide rates. When the growth rate in unemployment varies by 1% the growth rate in homicides varies by approximately 0.39% in the same direction. Also, when the growth rate in alcohol consumption varies by 1%, the growth rate in homicides varies by approximately 1.38% in the same direction. This model indicates that there is a positive relationship between homicide and unemployment rates and rates of per capita alcohol consumption such that when rates of unemployment increase (or decrease) there is a corresponding change in homicide rates in the same direction. Similarly, when rates of per capita alcohol consumption increase (or decrease) there is a corresponding change in rates of homicide in the same direction.
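
The interpretation of these coefficients can be sketched in Python using the homicide model given in note 3. The error term Z(t) is set to zero, so the result is the expected growth only; the function name is an illustrative assumption.

```python
# Coefficients of the homicide model in note 3:
#   ΔlogHom(t) = 0.39 ΔlogUnemp(t) + 1.38 ΔlogAlcohol(t) + Z(t)
UNEMPLOYMENT_COEF = 0.39
ALCOHOL_COEF = 1.38


def expected_homicide_growth(unemployment_growth_pct, alcohol_growth_pct):
    """Expected growth (%) in the homicide rate, with the error term Z(t) set to zero."""
    return UNEMPLOYMENT_COEF * unemployment_growth_pct + ALCOHOL_COEF * alcohol_growth_pct


print(round(expected_homicide_growth(1.0, 0.0), 2))  # 0.39
print(round(expected_homicide_growth(0.0, 1.0), 2))  # 1.38
print(round(expected_homicide_growth(1.0, 1.0), 2))  # 1.77
```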

Financially motivated crimes (note 4)

Inflation, and not unemployment rates, was found to be associated with all "financially motivated" crimes examined: robbery, motor vehicle theft (note 5) and break and enter. Results of the time series analysis indicate that when the inflation rate varies by 1%, the growth rate of robbery will vary by approximately 0.026% and the growth rate of motor vehicle theft will vary by approximately 0.019% in the same direction. That is to say, if inflation increases (or decreases) so too will rates of robbery and motor vehicle theft.

In this study, only rates of break and enter were found to be significantly affected by changes in both the age structure of the population and inflation rates. The time series model (note 6) indicates that there is a positive relationship between rates of break and enter and the proportion of the population aged 15 to 24, such that when the growth rate of the population 15 to 24 years of age varies by 1%, growth rates of break and enter vary by approximately 1.67% in the same direction. When inflation varies by 1%, growth rates of break and enter vary by approximately 0.021% in the same direction.

Trends in crime revisited: predicting crime in 2002 and 2003

A further test of the validity of the models was to examine their capacity to predict crime patterns. An important question is whether the models for predicting crime patterns over the past several decades, in particular the decline during the 1990s, are equally effective in accounting for changes in 2002 and 2003.

The forecasting models developed for rates of homicide, robbery, break and enter and motor vehicle theft were statistically significant and accurate (note 7). This was particularly the case for forecasting crime trends in 2003 (note 8; Table 7). Between 2002 and 2003 rates of homicide declined slightly, while rates of robbery, break and enter and motor vehicle theft increased. The forecasting models presented in this paper would have accurately predicted a decline in homicide and increases in the other types of crimes examined, although to a greater extent than observed.

Forecasting crime trends in 2002 was slightly less accurate (note 9) than forecasting crime trends in 2003, partially because additional degrees of freedom increase the accuracy of forecasting models (in this case it is equivalent to an additional year of data) (Table 8). Between 2001 and 2002 rates of homicide increased while rates of robbery, break and enter and motor vehicle theft declined slightly. The forecasting models presented in this paper would have accurately predicted the observed decrease in motor vehicle theft and the observed increase in rates of homicide. On the other hand, the forecasting models would have predicted increases in rates of robbery and break and enter when in fact they declined.

Discussion

The greatest gains in reducing crime rates in recent years were made in property crimes, especially among young offenders. Significant declines were also noted for robberies and homicides with firearms as well as homicides overall. This study suggests there are relationships between crime rates and trends in other major socio-economic indicators, including inflation rates, population shifts, unemployment rates and per capita rates of alcohol consumption, and demonstrates the value of using time series analysis to examine these relationships. The results can be interpreted to mean that years in which certain social problems occur with greater frequency also tend to have higher rates of crime. In this study, years with higher rates of inflation tended to have higher rates of financially motivated crimes (robbery, break and enter, motor vehicle theft), while years with higher rates of per capita alcohol consumption and unemployment tended to have higher rates of homicide.

This study also stresses the importance of including inflation as a macroeconomic indicator of economic health. In this study, shifts in inflation rates - not unemployment rates - were associated with the financially motivated crime types examined. According to Devine et al. (1988), both unemployment and inflation critically shape macroeconomic and social welfare policies and therefore both indicators should be included in any macro-level analysis of crime. Results from this analysis support this argument.

Furthermore, these results suggest an unexpected consequence of the Bank of Canada's monetary policy, which aimed at keeping inflation rates around 2% throughout the 1990s. Readers will recall that inflation in Canada rose significantly in the 1970s and early 1980s and then declined by 1984 and again after 1991. Due to high inflation rates in the 1970s and early 1980s, Canada adopted inflation targets in February 1991. CPI inflation averaged 6% per year between 1981 and 1990, prior to these targets, and 2% per year between 1991 and 2000 (Longworth, 2002).

Previous research has found a positive relationship between crime rates and inflation (Devine et al., 1988; Long & Witte, 1981; Land & Felson, 1976). According to these researchers, there are a number of factors which contribute to the positive relationship between crime and inflation. During periods of high inflation, the price of goods relative to wages increases, which results in a reduction of real income. This reduction in real income has a significant impact on persons on fixed or minimum wage incomes. Inflation also destroys confidence in existing institutions and fuels a general climate of uncertainty and fear about the future (e.g. interest rates for personal loans and mortgages are higher, unemployment rates are higher, etc.). Cantor and Land (1985) have argued that economic distress prompts an "upward shift in the density distribution of the population along the criminal-motivation continuum". In other words, in times of high inflation when there is a significant differential between the price of goods and wages and uncertainty about one's economic future is high, those located at or near the motivational margin of legality may be more likely to cross the threshold into criminality. Furthermore, as Devine et al. (1988) point out, inflation rewards property criminals due to the rising demand for goods and subsequent real profits in the illegal goods market.

Finally, our results appear to support the contention that shifts in the age composition of the population are only one of many factors contributing to the overall crime drop (Steffensmeier & Harer, 1999; Levitt, 1999). Keeping in mind that only four crime types were examined in this study, shifts in the relative proportion of at-risk age groups in the population 15 to 24 years of age were found to be associated with shifts in rates of break and enter and were not significant for the other types of crimes studied. Furthermore, the effects of the population 25 to 34 were neutralized when the effects of unemployment, inflation and per capita alcohol consumption were controlled. This finding suggests that age can have a significant association depending on the type of crime being examined; however, it also suggests that other factors may offset the effects of a change in the age composition (or profile) of the population.


Notes

1. An important question is the extent to which the results and conclusions of the multivariate analysis at the national level are equally applicable in each of the provinces. However, there are many limitations to modeling crime rates at the provincial level related to availability of data that do not apply to the national level. Data for the independent variables selected for this study are available, but for a shorter timeframe. For example, provincial inflation rates are available from 1980 onward, and a measure of inter-provincial migration is available only from 1972.

2. In the case of time series, errors will themselves constitute a time series. One usually aims for the errors to be devoid of any structure, although they may be correlated. However, if one can extract the correlation in the errors then one ought to be left with a residual series with no correlation (or structure). Such a series is referred to as a white noise.

3. The time series model for homicide is:
ΔlogHom(t) = 0.39 ΔlogUnemp(t) + 1.38 ΔlogAlcohol(t) + Z(t).

4. The time series model for robbery is:
ΔlogRob(t) = 0.026 logInf(t) + Z(t) + 0.37 Z(t-1).
The time series model for motor vehicle theft is:
ΔlogMotor(t) = 0.0185 logInf(t) + Z(t) + 0.4676 Z(t-1) + 0.2367 Z(t-5) - 0.3551 Z(t-8) - 0.4703 Z(t-9).

5. Available data from 22 large police services (accounting for almost three-quarters of all police-reported vehicle thefts in Canada) indicate that approximately one out of every five stolen vehicles was not recovered in 2002 (Wallace, 2004a). Therefore, approximately one in five motor vehicle thefts may be linked to organized groups or theft rings. This is a large increase over the early 1970s when approximately 2% of all stolen vehicles were not recovered. Based on Wallace's (2004a) analysis, it could be inferred that the large majority of motor vehicle thefts are not "financially motivated"; however, there is an element of financial gain when organized crime is involved and also when the vehicle is used for transportation or to commit another crime.

6. The time series model for break and enter is:
ΔlogBreakEnter(t) = 0.0211 logInf(t) + 1.6736 ΔlogPop15(t) + Z(t) + 0.2899 Z(t-1) - 0.5348 Z(t-9).

7. When conducting time series analysis at Statistics Canada the acceptable limit for forecasting error is 15% or less. The forecasting models presented in this paper meet this criterion.

8. Error percentages were calculated for each of the crime types examined by subtracting the observed crime rate in 2003 from the forecasted crime rate in 2003, dividing the difference by the observed crime rate in 2003 and multiplying by 100. For example, the observed homicide rate in 2003 was 1.7 per 100,000 population. The ARIMA model developed for homicide would have predicted a rate of 1.8 per 100,000 population. Therefore the forecasting error is ((1.8-1.7)/1.7)*100 or 7.64% (the rounded rates shown here give 5.88%; the reported figure presumably reflects unrounded rates).

9. Error percentages were calculated for each of the crime types examined by subtracting the observed crime rate in 2002 from the forecasted crime rate in 2002, dividing the difference by the observed crime rate in 2002 and multiplying by 100. For example, the observed robbery rate in 2002 was 85 per 100,000 population. The ARIMA model developed for robbery would have predicted a rate of 90 per 100,000 population. Therefore the forecasting error is ((90-85)/85)*100 or 5.88%.
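
The error calculation in notes 8 and 9, together with the 15% limit in note 7, can be sketched in Python as follows. The function and constant names are illustrative assumptions; the figures are the robbery example from note 9.

```python
ACCEPTABLE_LIMIT_PCT = 15.0  # limit stated in note 7


def forecast_error_pct(forecast, observed):
    """(forecast - observed) / observed, expressed as a percentage."""
    return (forecast - observed) / observed * 100.0


def within_limit(forecast, observed):
    """True when the absolute forecasting error is 15% or less."""
    return abs(forecast_error_pct(forecast, observed)) <= ACCEPTABLE_LIMIT_PCT


# Robbery example from note 9: observed 85, forecast 90 (per 100,000 population).
print(round(forecast_error_pct(90, 85), 2))  # 5.88
print(within_limit(90, 85))                  # True
```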


Date modified: 2005-06-29
Online catalogue 85-561-MWE - Exploring Crime Patterns in Canada