Statistics Canada

Calculating the mean

Archived Content

Information identified as archived is provided for reference, research or recordkeeping purposes. It is not subject to the Government of Canada Web Standards and has not been altered or updated since it was archived. Please contact us to request a format other than those available.

The mean of a numeric variable is calculated by adding the values of all observations in a data set and then dividing that sum by the number of observations in the set. This provides the average value of all the data.

Mean = sum of all the observation values ÷ number of observations

There are two types of variables: discrete and continuous. Discrete variables can take only separate, distinct values and cannot be subdivided. For example, a hockey player can score 1 or 2 goals, but never 1 and a half goals. Continuous variables, however, can be divided into smaller units. A student's age can be 11 years, 7 months and 3 days, as opposed to just 11 or 12 years.

It is important that you understand the difference between these two types of variables, so that you can properly calculate the mean in any given situation. The following examples use discrete variables to calculate the mean.

Example 1 – Soccer tournament at Mount Rival I
Example 2 – Traffic fatalities
Example 3 – Soccer tournament at Mount Rival II
Example 4 – Height of 50 Grade 10 girls
Summary

 

Example 1 – Soccer tournament at Mount Rival I

Mount Rival hosts a soccer tournament each year. This season, in 10 games, the lead scorer for the home team scored 7, 5, 0, 7, 8, 5, 5, 4, 1 and 5 goals. What was the mean score?

Mean = sum of all the observed values ÷ number of observations

= (7 + 5 + 0 + 7 + 8 + 5 + 5 + 4 + 5 + 1) ÷ 10
= 47 ÷ 10
= 4.7

 

Therefore, in the 10-game tournament, the player scored an average of 4.7 goals per game. The average of 4.7 is not a whole number so it only has meaning in a statistical sense. In reality, it is impossible to score 4.7 goals, even if you are a top scorer.
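
As a minimal sketch, the calculation above can be reproduced in Python. The variable names are illustrative and not part of the original example; `statistics.mean` from the standard library is imported only as a cross-check on the hand calculation.

```python
from statistics import mean

# Goals scored by the lead scorer in each of the 10 games (from Example 1).
goals = [7, 5, 0, 7, 8, 5, 5, 4, 1, 5]

# Mean = sum of all the observed values ÷ number of observations
total_goals = sum(goals)        # 47
number_of_games = len(goals)    # 10
mean_goals = total_goals / number_of_games

print(mean_goals)               # 4.7
print(mean(goals))              # same value from the standard library
```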

The mathematical notation to calculate the mean for a discrete variable is as follows:

$\bar{x} = \frac{\sum x}{n}$ or $\sum x \div n$

where x stands for an observed value,

n stands for the number of observations in the data set,

$\sum x$ stands for the sum of all observed x values, and

$\bar{x}$ stands for the mean value of x.


Example 2 – Traffic fatalities

The following table lists the number of people killed in traffic accidents over a 10-year period. During this time period, what was the average number of people killed per year? How many people died each day on average in traffic accidents during this time period?

Table 1. Number of fatalities in traffic accidents
Year    Fatalities
1       959
2       1,037
3       960
4       797
5       663
6       652
7       560
8       619
9       623
10      583

Using the formula to calculate the mean for discrete variables, you can see that:

Mean = $\sum x \div n$

= (959 + 1,037 + 960 + 797 + 663 + 652 + 560 + 619 + 623 + 583) ÷ 10
= 7,453 ÷ 10
= 745.3

The average number of people killed per year is 745.3.

To calculate the daily death rate from traffic accidents, the average yearly death rate is divided by the number of days in a year (leap years are ignored).

= 745.3 ÷ 365
= 2.0

Therefore, on average, 2 people died each day in traffic accidents.
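
The two steps of this example can be sketched in Python as follows. The variable names are illustrative, and the constant of 365 days simply mirrors the text's choice to ignore leap years.

```python
# Fatalities for years 1 to 10, copied from Table 1.
fatalities = [959, 1037, 960, 797, 663, 652, 560, 619, 623, 583]

# Step 1: mean number of fatalities per year.
yearly_mean = sum(fatalities) / len(fatalities)

# Step 2: mean number of fatalities per day.
# Assumption carried over from the text: every year has 365 days.
DAYS_PER_YEAR = 365
daily_mean = yearly_mean / DAYS_PER_YEAR

print(f"{yearly_mean:.1f} per year")   # 745.3 per year
print(f"{daily_mean:.1f} per day")     # 2.0 per day
```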

Frequency tables

A frequency table lists the number of observations that fall at each value, or within each group of values, in a data set. It can be used with grouped or ungrouped variables.

For example, to provide a frequency table of the age of people in a data set, you can produce a table using the exact age (ungrouped), or you can group the ages (grouped).

An ungrouped variable can be regarded as a special type of grouped variable in which each group contains a single value. You can calculate the mean of a discrete variable using a frequency table. For an ungrouped variable, this method gives the exact mean. For a grouped variable, it gives an approximation of the true mean, and how accurate the approximation is depends on how evenly the observed values are spread within each group.
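
To illustrate how grouping can affect the result, the sketch below reuses the goal counts from Example 1 and compares the exact mean with a mean computed from group midpoints. The three groups (0 to 2, 3 to 5 and 6 to 8 goals) are hypothetical and chosen only for illustration; they do not appear in the original text.

```python
# Goals per game from Example 1.
goals = [7, 5, 0, 7, 8, 5, 5, 4, 1, 5]

# Hypothetical groups (an assumption for this sketch): inclusive bounds.
groups = [(0, 2), (3, 5), (6, 8)]

exact_mean = sum(goals) / len(goals)

# Treat every observation in a group as if it sat at the group's midpoint.
sum_xf = 0.0
for low, high in groups:
    midpoint = (low + high) / 2
    f = sum(1 for g in goals if low <= g <= high)   # frequency of the group
    sum_xf += midpoint * f
approx_mean = sum_xf / len(goals)

print(exact_mean, approx_mean)   # 4.7 4.3
```

In the 3 to 5 group, four of the five observations are 5, which is above the midpoint of 4, so the grouped calculation understates the mean in this case.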


Example 3 – Soccer tournament at Mount Rival II

Grouping observations in tables is useful when dealing with large amounts of data. The goal-scoring figures from the soccer tournament example can be displayed in a frequency table.

Table 2. Mount Rival soccer tournament, frequency of goals for lead scorer
Number of goals (x)    Frequency (f)    Total number of goals (xf)
0                      1                0
1                      1                1
4                      1                4
5                      4                20
7                      2                14
8                      1                8
Total (Σ)              10               47
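
As a rough sketch, a frequency table like Table 2 can be built from the raw scores with `collections.Counter` from the Python standard library. The variable names and the printed layout are illustrative only.

```python
from collections import Counter

# Goals per game from Example 1.
goals = [7, 5, 0, 7, 8, 5, 5, 4, 1, 5]

# Counter maps each distinct value x to its frequency f.
frequency = Counter(goals)

# One row per distinct value: x, f and the product xf.
rows = [(x, frequency[x], x * frequency[x]) for x in sorted(frequency)]

print("x  f  xf")
for x, f, xf in rows:
    print(x, f, xf)

# Column totals, matching the last row of the table.
total_f = sum(f for _, f, _ in rows)
total_xf = sum(xf for _, _, xf in rows)
print("Total", total_f, total_xf)
```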

Because the observations are grouped, the mathematical notation changes slightly.

For a discrete variable in a frequency table, the mean is calculated as follows:

$\bar{x} = \frac{\sum xf}{\sum f}$ or $\sum xf \div \sum f$

  • where x stands for an observed value,
  • xf stands for the product of an observed value, multiplied by its frequency,
  • $\sum xf$ stands for the total of all xf values,
  • $\sum f$ stands for the total of all frequencies, and
  • $\bar{x}$ stands for the mean value of x.

 

The calculation for the mean of the player's goals is:

Mean = $\sum xf \div \sum f$

= (0 + 1 + 4 + 20 + 14 + 8) ÷ (1 + 1 + 1 + 4 + 2 + 1)
= 47 ÷ 10
= 4.7

Since the variable is ungrouped, this is the exact mean. The next example shows what happens when working with grouped variables.
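
The frequency-table calculation can be sketched in Python as below. The list of (x, f) pairs is one possible way to represent Table 2 and is an assumption of this sketch, not a format given in the text.

```python
# Table 2 as (x, f) pairs: number of goals and how many games had that score.
table = [(0, 1), (1, 1), (4, 1), (5, 4), (7, 2), (8, 1)]

sum_xf = sum(x * f for x, f in table)   # total of all xf values: 47
sum_f = sum(f for _, f in table)        # total of all frequencies: 10

mean_goals = sum_xf / sum_f
print(mean_goals)                       # 4.7
```

The result matches the mean computed directly from the 10 raw scores, because each row of the table holds a single exact value.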


Example 4 – Height of 50 Grade 10 girls

The following table shows the heights of 50 randomly selected Grade 10 girls. What is the mean height of the girls?

Determine the midpoint of each class interval for a variable before calculating the mean from a frequency table.

Table 3. Mean height of 50 Grade 10 girls
Height (cm)    Midpoint (x)    Frequency (f)    Midpoint × frequency (xf)
150 –< 155     152.5           4                610.0
155 –< 160     157.5           7                1,102.5
160 –< 165     162.5           18               2,925.0
165 –< 170     167.5           11               1,842.5
170 –< 175     172.5           6                1,035.0
175 –< 180     177.5           4                710.0
Total (Σ)      -               50               8,225.0

The calculation is the same as that used in the soccer tournament example above, except that the xf is now the product of the midpoint of the interval multiplied by the frequency of the same interval. This approximation is required because we do not know the exact height of each girl.

As a result, we must treat all of the heights as if they were midpoints for their interval. For example, because there are four girls in the interval of 150 –< 155 cm, we will treat each of the four girls as measuring 152.5 cm. As was mentioned in the soccer tournament example, the accuracy of the approximation of the mean will depend on how close each of the girls is to the midpoint of her interval.

Thus,

Mean = $\sum xf \div \sum f$

= (610.0 + 1,102.5 + 2,925.0 + 1,842.5 + 1,035.0 + 710.0) ÷ (4 + 7 + 18 + 11 + 6 + 4)
= 8,225.0 ÷ 50
= 164.5 cm

Therefore, the mean height of the 50 girls in Grade 10 is 164.5 cm.
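
A minimal Python sketch of the grouped calculation follows. The function name `grouped_mean` and the tuple layout are hypothetical; the sketch assumes each class is given by its lower bound, upper bound and frequency, as in Table 3.

```python
# Class intervals from Table 3 as (lower bound, upper bound, frequency).
# Each interval includes its lower bound and excludes its upper bound.
classes = [
    (150, 155, 4),
    (155, 160, 7),
    (160, 165, 18),
    (165, 170, 11),
    (170, 175, 6),
    (175, 180, 4),
]

def grouped_mean(classes):
    """Approximate the mean by treating every observation as its class midpoint."""
    sum_xf = 0.0
    sum_f = 0
    for lower, upper, f in classes:
        midpoint = (lower + upper) / 2   # x in the table
        sum_xf += midpoint * f           # xf in the table
        sum_f += f
    return sum_xf / sum_f

print(grouped_mean(classes))   # 164.5
```

Because the individual heights are unknown, this value is an approximation of the true mean rather than the exact mean.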


Summary

The mean is used in computing other statistics (such as the variance). It cannot be calculated for open-ended grouped frequency distributions, because an open-ended class has no midpoint. It is often not the most appropriate measure for skewed (unbalanced) distributions such as salary information. (See Measures of spread for more information on variance.)