A scatter diagram or scattergram is used to display
the results when two sets of data are compared to see if there is
a relationship between them.
Some examples:
The heights of people against their weights.
The size of icebergs and their distance from the South Pole.
The age of kiwifruit plants against the number of kiwifruit produced by the plants.
See if you can work out which value goes on which axis. Does it matter?
If all of the points lie on or near a straight line, there is said
to be a linear correlation between the two sets of data.
The first graph above has a positive
linear correlation: as one quantity increases, so does the other.
The second graph above has a negative
linear correlation: as one quantity increases, the other decreases.
The line that best fits the points is called the trend line,
the line of best fit or the regression line.
There are mathematical methods for drawing this line, but a simple
approximate method is to draw it by eye so that about half of the
points lie above the line and half below it.
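One such mathematical method is least squares, which chooses the line that makes the sum of the squared vertical distances from the points to the line as small as possible. The following is a minimal Python sketch of that idea. The function name `least_squares_line` and the small data set are made up for illustration and do not come from this page.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data, invented for this example only.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

slope, intercept = least_squares_line(xs, ys)

# Count the points on each side of the line, to compare with the
# "half above, half below" rule of thumb.
above = sum(1 for x, y in zip(xs, ys) if y > slope * x + intercept)
below = sum(1 for x, y in zip(xs, ys) if y < slope * x + intercept)
print(f"y = {slope:.3f}x + {intercept:.3f}; {above} above, {below} below")
```

A least-squares line does not have to split the points exactly in half, so the two counts are only a rough check on a line drawn by eye.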
Graphical calculators, such as the Casio
CFX-9850G, shown in the picture, can plot a scatter diagram,
draw the regression line, give its equation and calculate the
correlation coefficient, r.
The closer r is to 1 or -1, the
better the fit of the line to the data and the stronger the
relationship between the two sets of data.
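To make the meaning of r more concrete, here is a short Python sketch of how it is commonly defined (Pearson's product-moment correlation coefficient). It is not a description of how the calculator works internally. The function name `pearson_r` and the three small data sets are assumptions made for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for this example only.
xs = [1, 2, 3, 4, 5]
rising = [3, 5, 7, 9, 11]     # points exactly on a rising line
falling = [10, 8, 6, 4, 2]    # points exactly on a falling line
scattered = [4, 9, 2, 8, 5]   # no obvious pattern

for name, ys in [("rising", rising), ("falling", falling), ("scattered", scattered)]:
    print(f"{name}: r = {pearson_r(xs, ys):.2f}")
```

Points lying exactly on a rising line give an r of 1, points exactly on a falling line give -1, and points with no clear pattern give a value near 0.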
The formal study of correlation and regression
is outside the scope of the Bursary Statistics course.
Plotting a Scatter Diagram
Suppose a teacher wants to see if there is a connection between the
exam marks of students studying both mathematics and physics. The
table shows the results of 20 students.
| Student number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
| Mathematics mark | 45 | 67 | 93 | 45 | 56 | 67 | 68 | 34 | 54 | 89 | 59 | 60 | 43 | 90 | 41 | 30 | 56 | 76 | 89 | 65 |
| Physics mark | 56 | 69 | 89 | 39 | 52 | 61 | 69 | 43 | 59 | 94 | 60 | 52 | 41 | 84 | 41 | 39 | 60 | 73 | 92 | 62 |
These results could be put into two columns of a spreadsheet and
the following scatter diagram would result.

The graph appears to show a positive linear correlation. This
suggests that there is a relationship between the two sets of marks:
a student who scores well in mathematics is also likely to score
well in physics.
Note that on a scatter diagram it does not matter which
variable is placed on a particular axis.
Care must be taken when
reaching conclusions from scatter diagrams. Reasons other than
mathematical ones may need to be considered. For example, there
may be a mathematical correlation between the number of road
deaths and the number of four-wheel-drive vehicles on the road,
but it would be dangerous to say that one causes the other.
Is the recent fall in the number of road fatalities because
of the increase in the number of four-wheel-drive vehicles?
Samples of Scatter Diagrams
To see some samples of scatter diagrams from real data, click on
the link below.
These scatter diagrams show the line of best fit and include
some features outside the Bursary Mathematics with Statistics course.
You will also be able to enter your own data and the line of best
fit will be drawn for you.
This may take a few seconds. Be patient, it will be worth it!