Statistical or pseudo-statistical graphs are everywhere around us. Many media articles that quote statistical or survey data are accompanied by graphs, and statistical graphs are also an essential part of scientific and business reports.
But are all statistical graphs useful? How does one rate a graph one sees, and how does one know that a graph one has made is useful to others?
A good graph makes it easier for the reader to understand the content and interpret the meaning of the information, while at the same time staying faithful to the data. Bad graphs are either unclear and hard to decipher, or distort and manipulate the data.
Graphs should be clear and should not require extra effort to decode. Is it clear what the axes and data series represent? Is there enough information to make the graph clear, but not so much that it becomes busy or cluttered? Are the colours and patterns used easy to tell apart? There should be a legend whenever there is more than one series, and each axis should be clearly labelled with what it represents and its unit of measurement.
Graphs should be of a type and style that suits the character of the data and is commonly used for it. For example, the figures in a pie chart have to add up to 100%. Time is usually represented on the horizontal axis, and moving it to the vertical axis will make readers struggle.
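The pie chart rule can be checked before drawing anything. The following is a minimal sketch, not a standard routine: the function name `can_be_pie_chart`, the rounding tolerance of 0.5 percentage points, and both sets of survey figures are assumptions made up for illustration.

```python
import math

def can_be_pie_chart(shares_percent, tolerance=0.5):
    """Return True if the values are non-negative shares summing to about 100%.

    The tolerance (an assumed value) allows for rounding in published figures.
    """
    if any(share < 0 for share in shares_percent):
        return False
    return math.isclose(sum(shares_percent), 100.0, abs_tol=tolerance)

# Hypothetical survey where each respondent picked exactly one answer.
single_choice = [45.0, 30.0, 25.0]
# Hypothetical survey where respondents could tick several answers.
multiple_choice = [60.0, 55.0, 40.0]

print(can_be_pie_chart(single_choice))    # True: parts of one whole
print(can_be_pie_chart(multiple_choice))  # False: shares overlap, total is 155%
```

Data that fail this check, such as answers to a multiple-choice question, are usually better served by a bar chart.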
The point where the axes cross should be clear: it is normally assumed to be zero, and if it is not, the graph should make clear what the intersection point is.
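To see why a non-zero crossing point needs to be flagged, it helps to compare how tall two bars are drawn with how large the values really are. This is an illustrative sketch only: the helper `visual_ratio` and the two values are hypothetical.

```python
def visual_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b
    when the value axis starts at `baseline` instead of zero."""
    if a <= baseline or b <= baseline:
        raise ValueError("both values must lie above the baseline")
    return (b - baseline) / (a - baseline)

# Hypothetical values: b is 10% larger than a.
a, b = 50.0, 55.0

print(visual_ratio(a, b, baseline=0.0))   # 1.1: bar heights match the data
print(visual_ratio(a, b, baseline=45.0))  # 2.0: the bar for b is drawn twice as tall
```

With the axis starting at 45, a 10% difference looks like a doubling, which is why a reader has to be told where the axes cross.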
The distances on the graph should be proportional to the distances in the data. If there is a need to “condense” the data, a clearly indicated log scale should be used, not an unmarked manipulation in which an inch at one point of the scale means twice as much as an inch at another point.
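What makes a log scale legitimate is that it follows one consistent rule: equal distances along the axis stand for equal ratios in the data. A small sketch of both mappings, assuming an axis running from 1 to 1000 (the function `axis_position` and the range are illustrative choices, not part of any plotting library):

```python
import math

def axis_position(value, lo, hi, log=False):
    """Map a value to a position between 0 (at lo) and 1 (at hi) along an axis."""
    if log:
        # Log scale: position is proportional to the logarithm of the value.
        return (math.log10(value) - math.log10(lo)) / (math.log10(hi) - math.log10(lo))
    # Linear scale: position is proportional to the value itself.
    return (value - lo) / (hi - lo)

lo, hi = 1, 1000
for value in (1, 10, 100, 1000):
    linear = axis_position(value, lo, hi)
    logarithmic = axis_position(value, lo, hi, log=True)
    print(value, round(linear, 3), round(logarithmic, 3))
```

On the linear axis the values 1, 10 and 100 are crowded into the first tenth of the range, while on the log axis each tenfold increase takes up the same distance. Either mapping is honest as long as the reader is told which one is in use.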
Any differences depicted, and particularly any differences emphasised in the graph, should be statistically significant. Scientific papers often present graphs regardless of whether the results obtained in the experiment were statistically significant, but this should be avoided in graphs for popular science books or media use.
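For survey percentages, one common way to check a difference before emphasising it is a two-proportion z-test. The sketch below is one possible check under stated assumptions, not the only valid one: it assumes two independent simple random samples large enough for the normal approximation, and the poll figures and the conventional 0.05 threshold are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def two_proportion_p_value(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if standard_error == 0:
        # Both samples are all "yes" or all "no": no difference to test.
        return 1.0
    z = (p_a - p_b) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical polls: 52% versus 48% support in both cases.
small_poll = two_proportion_p_value(260, 500, 240, 500)
large_poll = two_proportion_p_value(2600, 5000, 2400, 5000)

print(small_poll < 0.05)  # False: 500 respondents per group is too few
print(large_poll < 0.05)  # True: the same gap with 5000 per group
```

The same four-point gap is worth highlighting in one case and not in the other, which is why the size of a difference on the graph says little without the sample behind it.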
Differences should not be emphasised unduly, whether by distorting the scales or by picking a chart format that exaggerates them.
All relevant data should be presented in the graph, but no irrelevant data should be shown. A good graph is a balanced trade-off between relevance and completeness.
Graphs with two Y axes should be avoided unless there is a clear relationship between the two variables represented on those axes, and unless that relationship is not prone to distortion through manipulation of the scales used.