Illuminations: Least Squares Regression

Least Squares Regression


Automobile Mileage: Year vs. Mileage

In this lesson, students plot data about automobile mileage and find the equation of the least squares regression line. By examining the graphical representation of the data, students interpret the slope and y-intercept of the line in the context of the real-life application. Students also make decisions about the age and mileage of automobiles based on the equation of the least squares regression line.

Learning Objectives

 

Students will

  • interpret slope as a rate of change in the context of real-life data
  • interpret the y-intercept of a line in the context of real-life data
  • make predictions based on the equation or graph of the least squares regression line
  • explain the difference between actual values and predicted values

Materials

 

Instructional Plan

To introduce this lesson, the class should briefly review the meaning of slope as a rate of change and the interpretation of the y-intercept of a line in the context of a real-life application. The teacher should refer to specific examples used earlier in this Unit Plan. The meaning of the correlation coefficient should also be reviewed. Guiding Questions to help you with this discussion follow.

  1. How does slope relate to the graph of a line?
  2. How does slope represent a rate of change?
  3. How do the labels on the axes of a graph help you determine the rate of change?
  4. What does the y-intercept indicate about the data on a graph?
  5. How do the labels on the axes of a graph help determine the meaning of the y-intercept in the context of a real-life application?
  6. What does the correlation coefficient indicate about the least squares regression line?
  7. Why are some correlation coefficients positive and others negative?
  8. What can we conclude about correlation coefficients with very small values (close to zero)?
  9. What does the sign of the correlation coefficient indicate about the slope of the least squares regression line?
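
For teachers who want the formulas behind questions 6 through 9 at hand, the standard definitions can be written as follows. The symbols are assumptions introduced here and do not appear on the handout: the data points are (x_i, y_i), their means are x-bar and y-bar, b is the slope, a is the y-intercept, and r is the correlation coefficient.

```latex
\[
b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x},
\qquad
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
```

The slope b and the correlation coefficient r share the same numerator, and their denominators are always positive, so the two always have the same sign. A value of r close to zero indicates that a line describes the data poorly.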

Today’s activity should begin with a general discussion of what students know about the year in which an automobile was produced and the mileage they would expect to be on the automobile. The teacher should consistently refer to the year in which the car was produced and not the age of the automobile.

Students should be divided into teams of two to work at the computer and given a copy of the handout Automobile Mileage—Year vs. Mileage (or a similar handout of student-produced data). They should visit the Web site: http://illuminations.nctm.org/index_d.aspx?id=454.

Working together, the partners can share the responsibility of making sure the data is plotted correctly. One student should plot the data, while the second reads out the data and makes sure it is plotted correctly. This is a good time to bring up independent and dependent variables. Explain to students that x is the independent variable and y is the dependent variable, and ask them to identify which is which in the year vs. mileage situation. Are students clear that the year is on the x-axis and is the independent variable, and that the mileage is on the y-axis and is the dependent variable?

Students should click on the applet and make the changes in the viewing window indicated on the handout. As the students begin to plot the data, the teacher should walk from group to group, making sure the students are plotting the data correctly. Encourage students to think about the data they are plotting and the resulting plots. Ask questions such as the following as you monitor and facilitate the group work:

  • Are you beginning to notice any pattern in the shape of the plot?
  • What type of function do you think will fit this data?
  • Do you think the slope of the line will be positive or negative?
  • What do you think the y-intercept of the regression line might be?

Allow the students to complete the plot and answer the questions on the handout. Continue to circulate and facilitate discourse between the partners.

After completing the questions on the handout, students should be given the opportunity to discuss their findings as a class. The questions on the handout can be used to help guide this discussion. This will encourage students to reflect on what they have discovered about the graphical and algebraic estimations of the real data and allow them to strengthen their understanding of slope as a rate of change. The teacher should pay particular attention to the students' understanding of the units used on the axes.
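
To check the applet's result independently during the class discussion, a teacher could compute the line by hand or with a short script. The following is a minimal sketch, not the applet's implementation. The function name `least_squares_line` and the five data points are hypothetical and are not the handout's data.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: year of production (x) and odometer reading in miles (y).
years = [1995, 1996, 1997, 1998, 1999]
mileage = [90000, 78000, 61000, 49000, 32000]

slope, intercept = least_squares_line(years, mileage)

# The units of the slope come from the axis labels: miles per year of production.
print(f"slope: {slope:.0f} miles per year")
# The intercept is the predicted mileage for year 0, far outside the data.
print(f"y-intercept: {intercept:.0f} miles")
```

With these made-up values the slope is negative: each later production year corresponds to fewer miles on the odometer. The y-intercept is a very large number with no practical meaning, because year 0 lies far outside the data. This gives students a concrete reason to look at the axis labels before interpreting either quantity.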

Assessment Options

 

The discussion will give both the teacher and students an opportunity to assess the students’ understanding of the lesson. At this stage of the activity, it is important to know if students can:

  • Correctly plot data points on the applet
  • Determine if a least squares regression line is actually a good fit for the data being graphed
  • Interpret slope as a rate of change
  • Explain a negative rate of change
  • Interpret the meaning of the y-intercept
  • Understand the meaning of the correlation coefficient
  • Use the units on the axes to help interpret the meaning of the slope and y-intercept
  • Explain the difference between actual values and predicted values

The Guiding Questions and the questions on the handout help students focus on the mathematics and aid you in determining the students’ level of understanding of the mathematical concepts in this lesson. If you began using Status of the Class to record students’ understanding of Lesson One of this Unit Plan, continue to document student understanding and progress. Documenting information about students’ understanding throughout the lesson(s) can help you focus on the needs and strengths of individual students, and thus can increase student learning opportunities. If students need more practice with actual and predicted values, work through some additional data sets. Have them make more predictions and compare their answers to those of their classmates.
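
For extra practice with actual and predicted values, students can tabulate the difference between each observed mileage and the mileage the line predicts. The sketch below assumes a hypothetical regression equation and hypothetical data. Neither comes from the handout, and `predicted_mileage` is a name chosen for illustration.

```python
# Hypothetical regression line, standing in for the equation the applet reports.
SLOPE = -14500.0         # miles per year of production (assumed)
INTERCEPT = 29018500.0   # miles (assumed)


def predicted_mileage(year):
    """Mileage the regression line predicts for a given production year."""
    return SLOPE * year + INTERCEPT


# Hypothetical observations: (year of production, actual mileage).
years = [1995, 1996, 1997, 1998, 1999]
actual = [90000, 78000, 61000, 49000, 32000]

# Difference for each car: actual value minus predicted value.
differences = [m - predicted_mileage(y) for y, m in zip(years, actual)]

for year, miles, diff in zip(years, actual, differences):
    print(f"{year}: actual {miles}, predicted {predicted_mileage(year):.0f}, "
          f"difference {diff:+.0f}")
```

A positive difference means the car has more miles than the line predicts. A negative difference means it has fewer.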

Extensions

 

The following questions can be used to help students relate what they have done during this lesson to other topics about linear functions.

  1. For what values of x did your equation seem appropriate? Why?
  2. What values of y do you think would be appropriate for the “real-life data” used in this example? Why?
  3. What would be an appropriate domain for your graph?
  4. What would be an appropriate range for your graph?
  5. What was the independent variable in this application?
  6. What was the dependent variable in this application?
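
One way to open the domain and range questions is to ask where the line stops making sense. The sketch below uses a hypothetical regression equation, not the handout's, and finds the production year at which the predicted mileage reaches zero.

```python
# Hypothetical regression line (assumed values, for illustration only).
SLOPE = -14500.0         # miles per year of production
INTERCEPT = 29018500.0   # miles


def predicted_mileage(year):
    """Mileage the regression line predicts for a given production year."""
    return SLOPE * year + INTERCEPT


# Solve SLOPE * year + INTERCEPT = 0 for the year.
zero_year = -INTERCEPT / SLOPE
print(f"predicted mileage reaches zero near year {zero_year:.1f}")

# Beyond that year the line predicts negative mileage, which cannot occur.
print(f"prediction for 2005: {predicted_mileage(2005):.0f} miles")
```

Under these assumed values, production years past the zero point give negative predictions, so a reasonable domain ends before that year and a reasonable range contains only non-negative mileages.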

Teacher Reflection

 
  1. Which students met all the objectives of this lesson? What extension activities are appropriate for those students?
  2. Which students did not meet the objectives of this lesson? What instructional experiences do they need next? What mathematical ideas need clarification?
  3. What adjustments would you make the next time you teach this lesson?

NCTM Standards and Expectations

 
Data Analysis & Probability 9-12
  1. Evaluate published reports that are based on data by examining the design of the study, the appropriateness of the data analysis, and the validity of conclusions.
  2. Understand how basic statistical techniques are used to monitor process characteristics in the workplace.
  
1 period   

NCTM Resources

Principles and Standards for School Mathematics

© 2000 National Council of Teachers of Mathematics