Illuminations: The Regression Line and Correlation



The Effects of Outliers

Interactive computer-based tools give students an easy way to investigate the relationship between a set of data points and a curve used to fit them. As students work with bivariate data in grades 9-12, they will be able to investigate relationships between the variables using linear, exponential, power, logarithmic, and other functions for curve fitting (see the related 9-12 Data Analysis & Probability Standard). Using interactive tools like the one below, students can investigate the properties of regression lines and correlation.

Learning Objectives

 

Students will

  • investigate the effect of outliers on a regression line and easily see their significance

Materials

 
  • Computer and Internet connection

Instructional Plan

It is important for students to develop facility in recognizing outliers and appreciating their effect on regression curves and residuals. (See the related 9-12 Data Analysis & Probability Standard. To further investigate the least squares regression line and residuals, see the regression line E-example.) Using interactive tools, students can investigate the effect of outliers on a regression line and easily see their significance.

In this section, you will see that one point in a data set that is far away from all the others can change the line dramatically.

Investigating the Effects of Outliers on the Regression Line

1. CLEAR the graph. Plot about 8 points that seem to be approximately on a line that has positive slope (slanted up as you move left to right). Click on SHOW GRAPH to see the regression line.
 
• Does the line fit the points well?
 
• Does the equation show a positive slope?
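
For readers working without the applet, step 1 can be approximated with a short script. This is a minimal sketch, not the applet's own code: the eight data points are hypothetical, and the helper names `fit_line` and `correlation` are introduced here for illustration. It computes the least squares line and the correlation coefficient from their standard formulas.

```python
import math

def fit_line(points):
    """Return (slope, intercept) of the least squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def correlation(points):
    """Return the Pearson correlation coefficient r of the (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical points lying roughly on a line with positive slope.
points = [(1, 1.2), (2, 1.9), (3, 3.2), (4, 3.8),
          (5, 5.1), (6, 6.0), (7, 6.8), (8, 8.1)]

slope, intercept = fit_line(points)
r = correlation(points)
print(f"y = {slope:.3f}x + {intercept:.3f}, r = {r:.3f}")
```

A positive slope and a value of r close to 1 correspond to "yes" answers to the two questions above.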
 
 
2. Add an outlier point, that is, a point that does not follow the trend established by the other points at all.
• Describe what happens to the regression line. Explain.
 
• Grab this outlier and drag it around. Observe how the regression line changes. Describe any patterns that you see.
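
The dragging experiment in step 2 can be imitated numerically. The sketch below is one possible illustration under stated assumptions: the base points and the outlier positions are hypothetical, and `fit_line` is a helper defined here, not part of the applet. It places a ninth point at the right edge of the data and lowers it step by step, refitting the line each time.

```python
def fit_line(points):
    """Return (slope, intercept) of the least squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical points lying roughly on a line with positive slope.
base = [(1, 1.2), (2, 1.9), (3, 3.2), (4, 3.8),
        (5, 5.1), (6, 6.0), (7, 6.8), (8, 8.1)]
base_slope, base_intercept = fit_line(base)
print(f"no outlier:         slope = {base_slope:.3f}")

# "Drag" a single extra point straight down at x = 8 and refit each time.
outlier_x = 8.0
outlier_heights = [8.0, 4.0, 0.0, -4.0, -8.0, -12.0]
slopes = []
for outlier_y in outlier_heights:
    slope, _ = fit_line(base + [(outlier_x, outlier_y)])
    slopes.append(slope)
    print(f"outlier at (8, {outlier_y:5.1f}): slope = {slope:.3f}")
```

With these particular numbers the slope falls steadily as the point is lowered, and the lowest position is enough to make the fitted slope negative even though the other eight points still trend upward.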
 
3. As you have seen, an outlier can significantly affect the regression line. CLEAR the graph and begin again with about 8 points that seem to be approximately on a line that has positive slope.
 
• Experiment with dragging an outlier point to find locations of an outlier that cause the regression line to drastically change slope.
 
• Find other locations of an outlier that cause the regression line to shift without changing slope.
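
One way to check findings from step 3 is to compare two outlier positions numerically. The sketch below rests on a property of the least squares formulas: an added point whose x-value equals the mean of the existing x-values leaves the slope unchanged and moves only the intercept, while a point far from that mean in the x-direction has strong leverage on the slope. The data, the outlier positions, and the helper `fit_line` are hypothetical choices made for this illustration.

```python
def fit_line(points):
    """Return (slope, intercept) of the least squares line through (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical points lying roughly on a line with positive slope.
base = [(1, 1.2), (2, 1.9), (3, 3.2), (4, 3.8),
        (5, 5.1), (6, 6.0), (7, 6.8), (8, 8.1)]
mean_x = sum(x for x, _ in base) / len(base)  # 4.5 for this data

base_slope, base_intercept = fit_line(base)

# Outlier directly above the center of the data: the line shifts up.
centered_slope, centered_intercept = fit_line(base + [(mean_x, 15.0)])

# Outlier far to the right and low: the line tilts.
far_slope, far_intercept = fit_line(base + [(12.0, 0.0)])

print(f"base:             slope = {base_slope:.3f}, intercept = {base_intercept:.3f}")
print(f"outlier at x-bar: slope = {centered_slope:.3f}, intercept = {centered_intercept:.3f}")
print(f"outlier far away: slope = {far_slope:.3f}, intercept = {far_intercept:.3f}")
```

Students can compare these results with the locations they found by dragging the point in the applet.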

Note: Use the link below to go to the applet, rather than scrolling down.
Go to the Regression Line Applet


 



Reflection Questions

  1. In the previous part you were asked to think of a real-world example where you would find a line of best fit. In that real situation, what would be an outlier?
  2. How would you summarize the effect of outliers on the regression line?
  3. Think back to the example in part 1 of using the regression equation to predict a person's weight when you know their height. What would be an outlier in this case? Could you justify leaving out that point and using just the remaining points to calculate the regression equation?


NCTM Standards and Expectations

 
Data Analysis & Probability 9-12
  1. For bivariate measurement data, be able to display a scatterplot, describe its shape, and determine regression coefficients, regression equations, and correlation coefficients using technological tools.
  2. Display and discuss bivariate data where at least one variable is categorical.
  3. Recognize how linear transformations of univariate data affect shape, center, and spread.

References

 
  • NSF Copyright Notice: Applet generously provided by: L. O. Cannon, James Dorward, E. Robert Heal, Richard Wellman (Utah State University, www.matti.usu.edu). The USU MATTI project is supported by the National Science Foundation (Award #9819107). Copyright 1999.
  
1 period   



© 2000 National Council of Teachers of Mathematics