Morrison on Metrics: Grouped thoughts on scattergrams

The finer points of plotting graphs and lines

When you create a scattergram, the most common next step in analysis and visualization is to superimpose a trend line on it. If you use the default trend-line function of Excel, the line will be straight (known as the least squares line), and Excel will generate an equation for that line. A straight line, however, may convey a distorted sense of the trend when the scattergram has a few unusually high or low numbers, or anomalies in the middle of otherwise fairly consistent numbers. True, you can selectively delete the outliers, but doing so leaves you vulnerable to charges of data manipulation.
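
To make the distortion concrete, here is a minimal sketch of a least squares line computed by hand in Python. It is not Excel's own code; the function name `least_squares_line` and the ten data points are made up for illustration, with one anomaly planted in the middle of otherwise perfectly consistent numbers.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross-deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data only: ten points lying exactly on y = 2x + 1.
xs = list(range(1, 11))
ys_clean = [2.0 * x + 1.0 for x in xs]

# Same data with one anomaly in the middle (x = 5 jumps from 11 to 60).
ys_outlier = list(ys_clean)
ys_outlier[4] = 60.0

slope_clean, intercept_clean = least_squares_line(xs, ys_clean)
slope_out, intercept_out = least_squares_line(xs, ys_outlier)

print("clean:       ", round(slope_clean, 3), round(intercept_clean, 3))
print("with outlier:", round(slope_out, 3), round(intercept_out, 3))
```

With these made-up numbers, the single anomaly lowers the slope from 2 to roughly 1.7 and raises the intercept from 1 to roughly 7.5, so the fitted line sits above the consistent points along its whole length.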

Alternatively, you may prefer to let the plot of points reveal its pattern with a line that fits the points more snugly. Various programs offer a smoothing function that creates such a line, one that may have bumps and squiggles in it. One advantage of a smooth fit over a more traditional linear fit is that smoothed lines are local: an outlier affects only the part of the smooth line near it. With a linear method, by contrast, outliers distort the entire straight line, because the equation that generates the slope treats all the data points as equally influential.
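
The local behavior of a smoother can be sketched in a few lines of Python. This is only one simple smoothing approach (a locally weighted straight-line fit with tricube weights, in the spirit of LOESS), not the method of any particular program; the function name `local_linear_smooth`, the `span` parameter, and the data are assumptions made for illustration.

```python
def local_linear_smooth(xs, ys, span=4):
    """Smooth ys by fitting a weighted line around each x.

    Assumes distinct x values and span >= 3. Only points closer than the
    span-th nearest neighbor receive a nonzero (tricube) weight.
    """
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its span-th nearest point.
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[span - 1]
        # Tricube weights fall to zero at distance h and beyond.
        weights = [max(0.0, 1.0 - (abs(x - x0) / h) ** 3) ** 3 for x in xs]
        total = sum(weights)
        mean_x = sum(w * x for w, x in zip(weights, xs)) / total
        mean_y = sum(w * y for w, y in zip(weights, ys)) / total
        sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
        sxy = sum(w * (x - mean_x) * (y - mean_y)
                  for w, x, y in zip(weights, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        # Value of the local line at x0.
        fitted.append(mean_y + slope * (x0 - mean_x))
    return fitted


# Illustrative data only: y = 2x + 1, then one anomaly at x = 5.
xs = list(range(1, 11))
ys_clean = [2.0 * x + 1.0 for x in xs]
ys_outlier = list(ys_clean)
ys_outlier[4] = 60.0

smooth_clean = local_linear_smooth(xs, ys_clean)
smooth_outlier = local_linear_smooth(xs, ys_outlier)

# x positions where the anomaly changed the smoothed value.
changed_xs = [x for x, a, b in zip(xs, smooth_clean, smooth_outlier)
              if abs(a - b) > 1e-9]
print(changed_xs)  # [4, 5, 6]
```

In this sketch the anomaly produces a bump in the smooth line at x = 4, 5 and 6 only; everywhere else the smoothed values are the same as for the clean data. A wider `span` would spread the bump over more neighbors while flattening it.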

Rees Morrison

Rees Morrison, Esq. is the founder of General Counsel Metrics, LLC.


