
- Example 1 – Discrete variables
- Example 2 – Continuous variables
- Other cumulative frequency calculations

*Cumulative frequency* is used to determine the number of observations that lie above (or below) a particular value in a data set. The cumulative frequency is calculated using a frequency distribution table, which can be constructed from stem and leaf plots or directly from the data.

The cumulative frequency is calculated by adding each frequency from a frequency distribution table to the sum of its predecessors. The last value will always be equal to the total for all observations, since all frequencies will already have been added to the previous total.
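The running-total rule above can be sketched in a few lines of Python. This is only an illustration: the `frequencies` list holds made-up values that do not come from the examples on this page, and `itertools.accumulate` is just one convenient way to form the running sum.

```python
from itertools import accumulate

# Hypothetical frequencies for five classes (illustrative values only).
frequencies = [2, 5, 8, 4, 1]

# Each cumulative frequency is the current frequency plus the sum of
# all the frequencies that precede it.
cumulative = list(accumulate(frequencies))

print(cumulative)
# The last cumulative frequency equals the total number of observations.
print(cumulative[-1] == sum(frequencies))
```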

Variables in any calculation can be characterized by the value assigned to them. A *discrete variable* consists of separate, indivisible categories. No values can exist between one value and its neighbour. For example, if you were to observe class attendance recorded from day to day, you might discover that the class has 29 students on one day and 30 students on another. However, it is impossible for student attendance to be between 29 and 30. (There is simply no room to observe any values between these two values, as there is no way of having 29 and a half students.)

Not all variables are characterized as discrete. Some variables (such as time, height and weight) are not limited to a fixed set of indivisible categories. These variables are called *continuous variables*, and they are divisible into an infinite number of possible values. For example, time can be measured in fractional parts of hours, minutes, seconds and milliseconds. So, instead of finishing a race in 11 or 12 minutes, a jockey and his horse can cross the finish line at 11 minutes and 43 seconds.

It is essential to know the difference between the two types of variables in order to properly calculate their cumulative frequency.

The total rock climber count of Lake Louise, Alberta was recorded over a 30-day period. The results are as follows:

31, 49, 19, 62, 24, 45, 23, 51, 55, 60, 40, 35, 54, 26, 57, 37, 43, 65, 18, 41, 50, 56, 4, 54, 39, 52, 35, 51, 63, 42.

- Use these discrete variables to:
  - set up a stem and leaf plot (see the section on stem and leaf plots) with additional columns labelled *Frequency*, *Upper value* and *Cumulative frequency*
  - figure out the frequency of observations for each stem
  - find the upper value for each stem
  - calculate the cumulative frequency by adding the numbers in the *Frequency* column
  - record all the results in the plot
- Plot a graph using the y-axis (or vertical line) for the cumulative frequency and the x-axis (or horizontal line) for the number of people rock climbing.

- The number of rock climbers ranges from 4 to 65. In order to produce a stem and leaf plot, the data are best grouped in class intervals of 10.

  Each interval can be located in the *Stem* column. The numbers within this column represent the first digit of the values within the class interval. (For example, *Stem* 0 represents the interval 0–9, *Stem* 1 represents the interval 10–19, and so forth.)

  The *Leaf* column lists the final digit of each observation that lies within the class interval. For example, in *Stem* 2 (interval 20–29), the three observations 23, 24 and 26 are represented as 3, 4 and 6.

  The *Frequency* column lists the number of observations found within a class interval. For example, in *Stem* 5, nine leaves (or observations) were found; in *Stem* 1, there are only two.

  Use the *Frequency* column to calculate cumulative frequency.

  - First, add the number from the *Frequency* column to its predecessor. For example, in *Stem* 0, we have only one observation and no predecessors. The cumulative frequency is one.

    1 + 0 = 1
  - However, in *Stem* 1, there are two observations. Add these two to the previous cumulative frequency (one), and the result is three.

    1 + 2 = 3
  - In *Stem* 2, there are three observations. Add these three to the previous cumulative frequency (three) and the total (six) is the cumulative frequency for *Stem* 2.

    3 + 3 = 6
  - Continue these calculations until you have added up all of the numbers in the *Frequency* column.
  - Record the results in the *Cumulative frequency* column.

  The *Upper value* column lists the observation with the highest value in each of the class intervals. For example, in *Stem* 1, the two leaves 8 and 9 represent the observations 18 and 19. The upper value of these two observations is 19.

  Table 1. Cumulative frequency of daily rock climber counts recorded in Lake Louise, Alberta, 30-day period

  | Stem | Leaf | Frequency (f) | Upper value | Cumulative frequency |
  |------|------|---------------|-------------|----------------------|
  | 0 | 4 | 1 | 4 | 1 |
  | 1 | 8 9 | 2 | 19 | 1 + 2 = 3 |
  | 2 | 3 4 6 | 3 | 26 | 3 + 3 = 6 |
  | 3 | 1 5 5 7 9 | 5 | 39 | 6 + 5 = 11 |
  | 4 | 0 1 2 3 5 9 | 6 | 49 | 11 + 6 = 17 |
  | 5 | 0 1 1 2 4 4 5 6 7 | 9 | 57 | 17 + 9 = 26 |
  | 6 | 0 2 3 5 | 4 | 65 | 26 + 4 = 30 |
- Since these variables are discrete, use the upper values in plotting the graph. Plot the points to form a continuous curve called an ogive.
Always label the graph with the cumulative frequency—corresponding to the number of observations made—on the vertical axis. Label the horizontal axis with the other variable (in this case, the total rock climber counts) as shown below:

The following information can be gained from either graph or table:

- on 11 of the 30 days, 39 people or fewer climbed the rocks around Lake Louise
- on 13 of the 30 days, 50 or more people climbed the rocks around Lake Louise
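As a cross-check on Example 1, the stem and leaf table can be rebuilt programmatically. The sketch below assumes the 30 daily counts listed in the example; the variable names (`counts`, `stems`, `rows`) are illustrative and not part of the original exercise. It groups each count by its tens digit, then records the leaves, frequency, upper value and cumulative frequency for each stem.

```python
from collections import defaultdict

# Daily rock climber counts from Example 1 (30 observations).
counts = [31, 49, 19, 62, 24, 45, 23, 51, 55, 60, 40, 35, 54, 26, 57,
          37, 43, 65, 18, 41, 50, 56, 4, 54, 39, 52, 35, 51, 63, 42]

# Group the sorted observations by stem (the tens digit).
stems = defaultdict(list)
for value in sorted(counts):
    stems[value // 10].append(value)

# Build one row per stem:
# (stem, leaves, frequency, upper value, cumulative frequency).
rows = []
running_total = 0
for stem in sorted(stems):
    values = stems[stem]
    running_total += len(values)
    leaves = " ".join(str(v % 10) for v in values)
    rows.append((stem, leaves, len(values), max(values), running_total))

for row in rows:
    print(row)

# The two readings quoted from the table can be checked directly.
days_39_or_fewer = sum(1 for v in counts if v <= 39)
days_50_or_more = sum(1 for v in counts if v >= 50)
print(days_39_or_fewer, days_50_or_more)
```

Plotting the upper value of each row against its cumulative frequency gives the points of the ogive described above.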

When a continuous variable is used, both calculating the cumulative frequency and plotting the graph require a slightly different approach from that used for a discrete variable.

For 25 days, the snow depth at Whistler Mountain, B.C. was measured (to the nearest centimetre) and recorded as follows:

242, 228, 217, 209, 253, 239, 266, 242, 251, 240, 223, 219, 246, 260, 258, 225, 234, 230, 249, 245, 254, 243, 235, 231, 257.

- Use the continuous variables above to:
  - set up a frequency distribution table
  - find the frequency for each class interval
  - locate the endpoint for each class interval
  - calculate the cumulative frequency by adding the numbers in the *Frequency* column
  - record all results in the table
- Use the information gathered from the frequency distribution table to plot a cumulative frequency graph.

- The snow depth measurements range from 209 cm to 266 cm. In order to produce the frequency distribution table, the data are best grouped in class intervals of 10 cm each.

  In the *Snow depth* column, each 10-cm class interval from 200 cm to 270 cm is listed.

  The *Frequency* column records the number of observations that fall within a particular interval. This column represents the observations in the *Tally* column, only in numerical form.

  The *Endpoint* column functions much like the *Upper value* column of Example 1, with the exception that the endpoint is the upper boundary of the interval, regardless of the actual value of each observation. For example, in the class interval of 210 to < 220, the actual values of the two observations are 217 and 219. But, instead of using 219, the endpoint of 220 is used.

  The *Cumulative frequency* column lists the total of each frequency added to its predecessor.

  Table 2. Snow depth measured at Whistler Mountain, B.C., 25-day period

  | Snow depth (*x*) | Tally | Frequency (f) | Endpoint | Cumulative frequency |
  |------------------|-------|---------------|----------|----------------------|
  |  |  |  | 200 | 0 |
  | 200 to < 210 |  | 1 | 210 | 1 |
  | 210 to < 220 |  | 2 | 220 | 3 |
  | 220 to < 230 |  | 3 | 230 | 6 |
  | 230 to < 240 |  | 5 | 240 | 11 |
  | 240 to < 250 |  | 7 | 250 | 18 |
  | 250 to < 260 |  | 5 | 260 | 23 |
  | 260 to < 270 |  | 2 | 270 | 25 |

- Because the variable is continuous, the endpoints of each class interval are used in plotting the graph. The plotted points are joined to form an ogive.

  Remember, the cumulative frequency (number of observations made) is labelled on the vertical y-axis and any other variable (snow depth) is labelled on the horizontal x-axis as shown in Figure 2.

The following information can be gained from either graph or table:

- none of the 25 days had snow depth less than 200 cm
- on one of the 25 days, the snow depth was less than 210 cm
- on two of the 25 days, the snow depth was 260 cm or more
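The frequency distribution for Example 2 can be reproduced with a short sketch. It assumes the 25 snow depth readings listed in the example and class intervals of the form "lower to < upper"; the names `depths`, `table` and `ogive_points` are illustrative. Unlike the discrete case, each row records the interval's endpoint rather than the largest observed value.

```python
# Snow depth readings (cm) from Example 2 (25 observations).
depths = [242, 228, 217, 209, 253, 239, 266, 242, 251, 240, 223, 219, 246,
          260, 258, 225, 234, 230, 249, 245, 254, 243, 235, 231, 257]

width = 10  # class interval width in cm

# Build one row per class interval:
# (interval label, frequency, endpoint, cumulative frequency).
table = []
running_total = 0
for lower in range(200, 270, width):
    upper = lower + width
    # Count observations with lower <= depth < upper.
    frequency = sum(1 for d in depths if lower <= d < upper)
    running_total += frequency
    table.append((f"{lower} to < {upper}", frequency, upper, running_total))

for row in table:
    print(row)

# Points for the ogive: start at (200, 0), then one point per endpoint.
ogive_points = [(200, 0)] + [(endpoint, cum) for _, _, endpoint, cum in table]
print(ogive_points)
```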

Another calculation that can be obtained using a frequency distribution table is the *relative frequency distribution*. This method is defined as the percentage of observations falling in each class interval. Relative frequency can be found by dividing the frequency of each interval by the total number of observations. (For more information, see Frequency distribution in the chapter entitled Organizing data.)

A frequency distribution table can also be used to calculate *cumulative percentage*. This method of frequency distribution gives us the percentage of the cumulative frequency, as opposed to the percentage of just the frequency.
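To make the difference between the two percentages concrete, the sketch below applies both calculations to the class frequencies from Table 2 (1, 2, 3, 5, 7, 5, 2). The variable names are illustrative, and expressing the results as percentages out of 100 is an assumption based on the definitions above.

```python
from itertools import accumulate

# Class frequencies from Table 2 (snow depth example), 25 observations in all.
frequencies = [1, 2, 3, 5, 7, 5, 2]
total = sum(frequencies)

# Relative frequency: each class frequency as a percentage of the total.
relative_pct = [100 * f / total for f in frequencies]

# Cumulative percentage: each cumulative frequency as a percentage of the total.
cumulative = list(accumulate(frequencies))
cumulative_pct = [100 * c / total for c in cumulative]

print(relative_pct)
print(cumulative_pct)
```

The relative percentages sum to 100, while the cumulative percentages climb to 100 at the last class interval.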