Illuminations: The Regression Line and Correlation

Correlation and the Regression Line

Interactive computer-based tools let students easily investigate the relationship between a set of data points and a curve fitted to those points. As students work with bivariate data in grades 9-12, they can investigate relationships between the variables using linear, exponential, power, logarithmic, and other functions for curve fitting (see the related 9-12 Data Analysis & Probability Standard). Using interactive tools like the one below, students can investigate the properties of regression lines and correlation.

Learning Objectives

 

Students will

  • learn about Pearson's correlation coefficient, a measure of the linear association between the horizontal variable and the vertical variable

Materials

 
  • Computer and Internet connection

Instructional Plan

An important question that comes up in choosing a curve to fit data points is: How scattered can the points be and still have a shape that can be represented by a curve? The idea of correlation helps to measure this. When you click "Show Line" in the interactive applet, the value r, which appears in the top left section of the applet, is Pearson's correlation coefficient. It is a measure of the linear association between the horizontal variable and the vertical variable. It tells you how tightly packed the data points are about the regression line, and therefore how well the regression line fits the data. The r-values range from -1 (perfect negative linear association) through 0 (no linear association) to +1 (perfect positive linear association); values close to -1 or +1 indicate a strong linear association. But beware! You will see below that the correlation coefficient, r, is sometimes misleading. You should always look at the scatterplot and combine that knowledge with the r-value in order to draw valid conclusions about the strength of the linear association.
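For readers who want to see what lies behind the r-value, the following is a minimal Python sketch that computes Pearson's correlation coefficient and a least-squares regression line from their textbook definitions. It assumes the applet draws the ordinary least-squares line, which the page does not state explicitly. The function names and the five data points are hypothetical, chosen only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: sum of products of deviations, divided by the
    square root of the product of the two sums of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def regression_line(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical scatterplot: five points with a loose upward trend.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
slope, intercept = regression_line(xs, ys)
print(f"r = {r:.3f}")                             # r = 0.775
print(f"y = {slope:.3f}x + {intercept:.3f}")      # y = 0.600x + 2.200
```

Both quantities are built from the same sums of deviations, which is why r says something about how well the line fits: the slope scales the co-variation by the spread in x alone, while r scales it by the spread in both variables.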



Explore the Relationship Between Correlation and Linear Association

Use the interactive math applet below to help you answer these questions:

1. Compare the r-values for the following three situations.
• Create a scatterplot that you think shows a strong positive linear association between the two variables. What is the r-value?
• Create a scatterplot that you think shows a strong negative linear association between the two variables. What is the r-value?
• Create a scatterplot that you think shows no linear association between the two variables. What is the r-value?
 
2. For each r-value below, create a scatterplot that has that exact r-value.
r = 1
r = -1
r = 0
 
3. Plot several points that exhibit a strong positive linear trend, and then plot one outlier.
• Overall, is this scatterplot roughly linear?
• Is the r-value close to 1?
 
4. In the lower left corner of the coordinate plane, plot 10 points that exhibit no trend (this is sometimes called a "cloud" of points). Then plot one point in the upper right corner.
• Overall, is this scatterplot linear?
• Is the r-value close to 1?

5. Does a high r-value necessarily mean that the data are generally linear? Does an r-value close to zero always mean that the data are not linear?

The moral is that the correlation coefficient, r, is a valuable tool for studying the linear association between two variables, but it does not fully explain the association (in fact, no statistic does).
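The situations in questions 3 and 4 can also be checked numerically. The sketch below is one possible illustration in Python, not the applet's own method. The coordinates are hypothetical and were picked so that the effect is easy to see; whether the applet's plotting area allows these exact points is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from its textbook definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Question 3: eight points exactly on the line y = x, then one outlier.
line_x = [1, 2, 3, 4, 5, 6, 7, 8]
line_y = [1, 2, 3, 4, 5, 6, 7, 8]
r_line = pearson_r(line_x, line_y)                    # perfect line, r is 1
r_outlier = pearson_r(line_x + [15], line_y + [0])    # r drops to 0, up to rounding

# Question 4: ten points with no trend, then one far-away point.
cloud_x = [1, 1, 2, 2, 2, 3, 3, 4, 4, 4]
cloud_y = [2, 4, 1, 3, 5, 2, 4, 1, 3, 5]
r_cloud = pearson_r(cloud_x, cloud_y)                 # no linear association, r is 0
r_far = pearson_r(cloud_x + [20], cloud_y + [20])     # r jumps above 0.9

print(f"line only:         r = {r_line:.2f}")
print(f"line + outlier:    r = {abs(r_outlier):.2f}")
print(f"cloud only:        r = {abs(r_cloud):.2f}")
print(f"cloud + far point: r = {r_far:.2f}")
```

In this sketch a single point is enough to pull r from 1 down to about 0 for data that are otherwise perfectly linear, and to push r from 0 up past 0.9 for data that show no trend at all. That is why the r-value should always be read alongside the scatterplot.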

Note: use the link below to go to the applet, rather than scrolling down.
Go to the Regression Line Applet


 




NCTM Standards and Expectations

 
Data Analysis & Probability 9-12
  1. Display and discuss bivariate data where at least one variable is categorical.
  2. Recognize how linear transformations of univariate data affect shape, center, and spread.
  3. For bivariate measurement data, be able to display a scatterplot, describe its shape, and determine regression coefficients, regression equations, and correlation coefficients using technological tools.

References

 
  • Copyright Notice: Applet generously provided by L. O. Cannon, James Dorward, E. Robert Heal, and Richard Wellman (Utah State University, www.matti.usu.edu). The USU MATTI project is supported by the National Science Foundation (Award #9819107). Copyright 1999.
  
1 period   

© 2000 National Council of Teachers of Mathematics