Originally published March 7, 2006
Readers make a number of judgments when reading graphs: they may judge the length of a line, the area of a wedge of a circle, the position of a point along a common scale, the slope of a line, or a number of other attributes of the points, lines, and bars that are plotted. Cleveland and McGill (1984) identified tasks or judgments that are performed when reading graphs and conducted carefully designed experiments to determine which of these judgments we make most accurately. They then designed a graph to take advantage of the knowledge gained from their experimentation. The result was the dot plot. This article introduces the dot plot and offers before and after examples to compare presentations using bar charts and dot plots.
The dot plot in Figure 1 shows the revenues of the top 60 companies from the Fortune 1000 list. Figure 2 shows these same revenues using a bar chart. Most readers would have little problem understanding either the dot plot or the bar chart. Note, however, that the dot plot is less cluttered and less redundant, and that it uses less ink.
Figure 1: This dot plot shows the revenues of the top 60 companies from the Fortune 1000 list.
Figure 2: The same information shown in Figure 1 is displayed this time in a bar chart.
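For readers who want to experiment with the form, here is a minimal sketch of a dot plot drawn as plain text. It is not the code used for the figures in this article. The function name `text_dot_plot`, the firm names, and the revenue values are all hypothetical, chosen only for illustration. Each row carries a light guide line across the full width and a single marker at the scaled position, so the eye judges position rather than length.

```python
def text_dot_plot(values, width=40):
    """Return the lines of a text dot plot, largest value first.

    values: dict mapping a label to a number (hypothetical data).
    width:  number of character columns used for the scale.
    """
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    label_w = max(len(name) for name in values)
    lines = []
    # Sort by value so that neighboring rows are easy to compare.
    for name, v in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        col = round((v - lo) / span * (width - 1))
        col = min(max(col, 0), width - 1)  # keep the marker inside the row
        row = ["."] * width                # guide line runs the full width
        row[col] = "o"                     # one marker per row
        lines.append(f"{name:<{label_w}} {''.join(row)}")
    return lines


# Hypothetical revenues in billions of dollars (not the Fortune 1000 data).
revenues = {"Firm A": 31.2, "Firm B": 30.1, "Firm C": 28.4}
for line in text_dot_plot(revenues):
    print(line)
```

Because the scale here runs from the smallest to the largest value rather than from zero, the sketch also illustrates that a dot plot does not need a zero baseline.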
The Fortune 1000 list also contains the profits of these companies. Figure 3 shows the profits for these 60 companies in the same order as in Figures 1 and 2 to help make comparisons between the charts.
Figure 3: This dot plot shows the profits for these same companies. Note that the companies are ordered by revenue to ease comparisons between charts and that the scale is not the same as used for revenues.
The power of the dot plot becomes evident if we wish to combine the information from Figure 1 (or Figure 2) and Figure 3 into a single chart. Both the revenues and the profits are shown in Figure 4. Showing both on the same figure gives an indication of their relative sizes and makes it easier to see those companies whose profits are not consistent with the others. However, the variation in the profits is hard to see on a scale that accommodates revenues, so Figure 3 is still needed to see this variation. It is often useful to plot the same data several ways, since each plot emphasizes a different aspect of the data. The presentation in Figure 4 would be much more cluttered and more difficult to interpret with a bar chart. A designer who wanted to show revenues and profits in the same figure might use a clustered bar chart (also called a grouped bar chart). However, there is no room in Figure 2 to add profits as a second group. It could be done by superposing a much thinner bar of a different color for profits on the revenue bars, or by using transparent bars so that both could be seen. However, that would result in a very busy, cluttered figure. Note that Figure 4 is not at all crowded.
Another advantage of Figure 4 is that it does not depend on color so that it can be used in black and white publications with no loss of clarity. The two groups can be distinguished by using different symbols.
Figure 4: This dot plot superposes the profit data on the same chart as the revenue data. Imagine how cluttered the bar chart would be if we tried to superpose the profit data on it.
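The idea of superposing two series and telling them apart by symbol rather than color can be tried out in plain text. The following is a sketch under stated assumptions, not the method used to draw Figure 4. The function name `two_series_dot_plot`, the firm names, and the numbers are hypothetical. Both series share one scale, the first is drawn with `o` and the second with `+`, and `*` marks a row where the two fall in the same column.

```python
def two_series_dot_plot(rows, width=40, symbols=("o", "+")):
    """Return text lines showing two series per row on one shared scale.

    rows: list of (label, first_value, second_value) tuples (hypothetical data).
    """
    values = [v for _, a, b in rows for v in (a, b)]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero when all values are equal
    label_w = max(len(label) for label, _, _ in rows)
    lines = []
    for label, a, b in rows:
        row = ["."] * width  # guide line runs the full width of the scale
        for v, sym in zip((a, b), symbols):
            col = round((v - lo) / span * (width - 1))
            # Mark an overlap with "*" instead of hiding one of the symbols.
            row[col] = "*" if row[col] != "." else sym
        lines.append(f"{label:<{label_w}} {''.join(row)}")
    return lines


# Hypothetical revenues and profits in billions of dollars.
data = [("Firm A", 50.0, 5.0), ("Firm B", 40.0, -2.0)]
for line in two_series_dot_plot(data):
    print(line)
```

As in Figure 4, the shared scale compresses the smaller series, which is why a separate plot of that series on its own scale remains useful.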
We have been concentrating on alternatives to simple bar charts. However, the dot plot is even more powerful when replacing clustered or stacked bar charts, since these graph forms do not communicate quantitative information as well as simple bar charts or dot plots do.
Study Figure 5 and then Figure 6. A number of facts are obvious from either presentation; for example, note that Asians have the highest percent of incomes over both $50,000 and $75,000 in all three counties. There are other facts that I see immediately in the dot plot. When I then go back to the clustered bar chart, I can find these facts there too, but I might not have noticed them had I not seen them first in the dot plot. An example is that whites have the lowest percent of incomes over $75,000 in Passaic County. As you gain more experience reading dot plots, you will find that they present information much more clearly than clustered bar charts do.
Figure 5: This is a clustered or grouped bar chart showing income data for various ethnic groups in several New Jersey counties. The information comes from the 2000 Census. It is hard to make comparisons across counties when there are so many bars in a group.
Figure 6: This shows the data of Figure 5 in a multi-panel dot chart.
Now assume that we are interested in detailed comparisons of the revenues of the seven companies with the lowest revenues displayed in Figures 1 through 4. Figure 7 shows this information. Although it is clear that their revenues are all about $30 billion, it is difficult to be more precise.
Figure 7: This shows the revenues of seven of the companies displayed in the first four figures. The bar chart has a zero baseline. We see that the revenues are similar to one another, but it is difficult to get more detail.
Recall that different graph forms require different types of judgments to decode the data. When we look at a bar chart, we may judge position along a common scale by using the horizontal axis to judge the position of the right end of the bars. However, we cannot help also seeing the length of the bars. Figure 8 shows more detail by starting the scale above zero, so the lengths of the bars are no longer proportional to the revenues. As a result, Figure 8 makes the revenues of Walt Disney appear many times larger than the revenues of Sysco, even though they are both about $30 billion. Figure 8 is a visual lie.
Figure 8: This graph does not use a zero baseline so that detail can be seen. This figure is a visual lie since it makes the revenues of Walt Disney appear many times those of Sysco.
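The size of this distortion can be worked out with a little arithmetic: the length of a bar is the value minus the baseline, so moving the baseline changes the ratio of the lengths even though the values are unchanged. The sketch below uses hypothetical revenues and a hypothetical baseline, not the actual figures for the companies in Figure 8, and the function name `bar_length_ratio` is likewise an illustrative assumption.

```python
def bar_length_ratio(a, b, baseline=0.0):
    """Ratio of the bar lengths for values a and b when the axis starts at baseline."""
    if min(a, b) <= baseline:
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


# Hypothetical revenues in billions of dollars (not the Fortune 1000 data).
larger, smaller = 30.8, 29.4

honest = bar_length_ratio(larger, smaller)                    # zero baseline
truncated = bar_length_ratio(larger, smaller, baseline=29.0)  # axis starts at 29

print(f"zero baseline: bars differ by a factor of {honest:.2f}")
print(f"baseline at 29: bars differ by a factor of {truncated:.2f}")
```

With these made-up numbers the two values differ by about 5 percent, yet with the baseline at 29 one bar is drawn about 4.5 times as long as the other.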
A dot plot is judged by position along the horizontal axis, so length is not an issue with dot plots. (This would not be the case in Figure 9 if the gray gridlines ended at the dots instead of continuing across the figure, because we would then judge the length of each gridline.) As a result, there is no distortion in this figure, and we can see the detail needed to compare these companies.
Figure 9: This shows the information of Figure 8 in a dot plot. The points are no longer connected to the baseline so that we are no longer judging length. This figure shows the detail of Figure 8 without the deception.
You probably learned to draw graphs with the independent variable on the horizontal or x axis and the dependent variable on the vertical or y axis. The reverse is true for the plots above. The reason is to make the company names easier to read. Vertical bar charts, or dot plots with the company names along the horizontal axis, would have required the labels to be rotated or drastically abbreviated.
The graphs in this article were drawn using the S language, with the exception of Figure 5, which was drawn in Excel. S-Plus and R are two implementations of S. S-Plus is commercial software available from Insightful Corporation in Seattle. R is open source software that is freely downloadable from http://www.r-project.org/. S-Plus offers both a graphical user interface and a command line language; R is a command line language. Dot plots can be drawn using Excel even though they do not appear in Excel’s menus. Send an e-mail to Naomi Robbins, email@example.com, for an Excel macro to draw dot plots. However, once you have climbed the learning curve of S or other software that offers dot plots, it is quicker and easier to use software designed to produce them.
Dot plots can be used in any situation for which bar charts are typically used. They are less cluttered, they make it easier to superpose additional data, and they do not require a zero baseline as do bar charts. See the references for additional discussion and examples of dot plots. Dot plots are a very useful addition to your graphical toolbox.
Cleveland, William S. 1984. “Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging.” The American Statistician, 38:270-280.
Cleveland, William S. 1993. Visualizing Data. Hobart Press, Summit, NJ.
Cleveland, William S. 1994. The Elements of Graphing Data. Revised edition. Hobart Press, Summit, NJ.
Cleveland, William S. and Robert McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” Journal of the American Statistical Association, 79:531-554.
Robbins, Naomi B. 2005. Creating More Effective Graphs. John Wiley and Sons, Hoboken, NJ.