5 Data Visualization
5.7 Histogram


The histogram is a popular graphing tool. It is used to summarize discrete or continuous data that are measured on an interval scale. It is often used to illustrate the major features of the distribution of the data in a convenient form. It is also useful when dealing with large data sets (greater than 100 observations). It can help detect any unusual observations (outliers) or any gaps in the data.

A histogram divides the range of possible values in a data set into classes or groups. For each group, a rectangle is constructed with a base equal to the range of values in that group and a height equal to the number of observations falling into that group. A histogram looks similar to a vertical bar chart, but there are no gaps between the bars. Generally, a histogram has bars of equal width. Chart 5.7.1 is an example of a histogram that shows the distribution of salary, a continuous variable, among the employees of a corporation.

Chart 5.7.1 Distribution of salaries of the employees of ABC Corporation

Data table for Chart 5.7.1 
Salary (in thousands of $) | Number of employees
0–10 | 50
11–20 | 300
21–30 | 250
31–40 | 400
41–50 | 550
51–60 | 433
61–70 | 266
71–80 | 350
81–90 | 100
91+ | 20
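
As a rough illustration, the counts in the data table for Chart 5.7.1 can be drawn as a text histogram, with one row per salary class and a bar whose length is proportional to the number of employees. This is only a sketch: the function name `text_histogram` and the scale of 50 employees per mark are assumptions made for this example, not part of the original chart.

```python
# Class labels and counts copied from the data table for Chart 5.7.1.
salary_classes = [
    ("0–10", 50), ("11–20", 300), ("21–30", 250), ("31–40", 400),
    ("41–50", 550), ("51–60", 433), ("61–70", 266), ("71–80", 350),
    ("81–90", 100), ("91+", 20),
]


def text_histogram(classes, per_mark=50):
    """Return one line per class; each '#' stands for about `per_mark` observations.

    `per_mark` is an illustrative scale chosen for this sketch.
    """
    lines = []
    for label, count in classes:
        bar = "#" * round(count / per_mark)
        lines.append(f"{label:>6} {bar} {count}")
    return lines


# Summary figures that the histogram makes easy to read off.
total = sum(count for _, count in salary_classes)
modal_label, modal_count = max(salary_classes, key=lambda pair: pair[1])

print("\n".join(text_histogram(salary_classes)))
print(f"Total employees: {total}")
print(f"Modal class: {modal_label} ({modal_count} employees)")
```

The tallest bar marks the modal class, and very short bars at either end (such as the 91+ class) stand out as the tails of the distribution.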

The following table presents the differences between a histogram and a vertical bar chart.


Table 5.7.1
Differences between bar chart and histogram
Comparison terms | Bar chart | Histogram
Usage | To compare different categories of data. | To display the distribution of a variable.
Type of variable | Categorical variables | Numeric variables
Rendering | Each data point is rendered as a separate bar. | The data points are grouped and rendered based on the bin value. The entire range of data values is divided into a series of non-overlapping intervals.
Space between bars | Can have space. | No space.
Reordering bars | Can be reordered. | Cannot be reordered.
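
The rendering difference in Table 5.7.1 can be sketched in a few lines of Python. A bar chart counts each distinct category, while a histogram first assigns numeric values to equal-width, non-overlapping classes and then counts each class. The department names, the salary values, and the helper `histogram_counts` are all hypothetical and exist only for this example.

```python
from collections import Counter

# Bar chart: hypothetical categorical data, one bar per distinct category.
departments = ["Sales", "IT", "Sales", "HR", "IT", "Sales"]
bar_chart_counts = Counter(departments)


def histogram_counts(values, low, width, n_classes):
    """Count values in equal-width classes [low + i*width, low + (i+1)*width).

    Values that fall outside the covered range are skipped in this sketch.
    """
    counts = [0] * n_classes
    for v in values:
        i = int((v - low) // width)  # index of the class containing v
        if 0 <= i < n_classes:
            counts[i] += 1
    return counts


# Histogram: hypothetical salaries (in thousands of $), grouped into the
# classes 10-20, 20-30, 30-40 and 40-50 before counting.
salaries = [12.5, 18.0, 23.4, 27.9, 31.0, 35.5, 38.2, 44.1]
hist_counts = histogram_counts(salaries, low=10, width=10, n_classes=4)

print(dict(bar_chart_counts))
print(hist_counts)
```

The categories of the bar chart could be listed in any order without changing their meaning, whereas the histogram classes follow the numeric scale and cannot be rearranged.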
