## Line of Best Fit

Grades: 6-8, 9-12

Standards (Math Content): Data Analysis and Probability

This activity allows the user to enter a set of data, plot the data on a coordinate grid, and determine the equation for a line of best fit.

Plot points by clicking anywhere on the grid, or plot a set of points by entering a pair of coordinates (one pair per line, separated by a comma or a tab) in the text box and clicking Update Plot. (Note that two columns of data can be copied and pasted from a spreadsheet program into this text box; the first column represents the x‑coordinates, and the second column represents the y‑coordinates.)
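The input format described above (one pair per line, separated by a comma or a tab) can be sketched in a few lines of Python. This is only an illustration of the format, not the activity's own code; the function name `parse_points` is hypothetical.

```python
import re

def parse_points(text):
    """Parse one (x, y) pair per line; values separated by a comma or a tab.

    Hypothetical helper for illustration, not part of the activity.
    """
    points = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        parts = re.split(r"[,\t]", line)
        if len(parts) != 2:
            raise ValueError(f"expected two values, got: {line!r}")
        x, y = (float(p) for p in parts)
        points.append((x, y))
    return points

# One comma-separated line and one tab-separated line (as pasted from a spreadsheet).
sample = "1819,2689\n1195\t2746\n"
print(parse_points(sample))
```

Accepting a tab as a separator is what makes pasting two spreadsheet columns work, since spreadsheet programs typically place a tab between copied cells.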

Check the Remove Points box and click on any point to remove it, or remove the coordinates of that point from the text box. Check the Move Points box and click‑and‑drag any point to change its location, or change the coordinates of that point.

When you check the box for Student Guess, a green line will appear on the grid. Drag the green dots to approximate a line of best fit visually. An equation of this line will appear to the right. When you check the box for Computer Fit, a red least-squares regression line will be displayed. An equation of this line and the correlation coefficient (r) will appear to the right.
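For readers who want to see what a least-squares fit involves, here is a minimal Python sketch. It assumes the standard simple linear regression of y on x; the activity does not document its internal method, and the function name `least_squares_fit` is hypothetical.

```python
import math

def least_squares_fit(points):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept.

    Assumes at least two points, with x values not all equal and
    y values not all equal (otherwise a division by zero occurs).
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sums of squared deviations and cross-deviations from the means.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Three points that lie exactly on y = 2x.
slope, intercept, r = least_squares_fit([(1, 2), (2, 4), (3, 6)])
print(f"y = {slope:.2f}x + {intercept:.2f}, r = {r:.2f}")
```

The line found this way minimizes the sum of the squared vertical distances from the points to the line, which is why a visual guess can look reasonable and still differ from the computed fit.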

The grid automatically re-scales as points are added. To see a different portion of the grid, select the + key and use the mouse to drag out a rectangular region of the grid to zoom in on. Click the zoom-out key to double the amount of grid that is shown. Note: The scaling is not consistent in all views. This is a known issue being worked on.

The data below shows the points scored and minutes played by the six "starters" for the Los Angeles Lakers during the 2004–05 season. (For this investigation, a "starter" is any player who averaged more than 20 minutes per game.)

Plot points scored along the horizontal axis and minutes along the vertical axis. Note that you can copy the data from the "Enter Below" column, paste it into the text box, and click Update Plot.

| Player | Points | Minutes | Enter Below |
|---|---|---|---|
| Kobe Bryant | 1819 | 2689 | 1819,2689 |
| Caron Butler | 1195 | 2746 | 1195,2746 |
| Chucky Atkins | 1115 | 2903 | 1115,2903 |
| Lamar Odom | 975 | 2320 | 975,2320 |
| Chris Mihm | 735 | 1870 | 735,1870 |
| Jumaine Jones | 577 | 1830 | 577,1830 |

Check the Computer Fit box to see a linear approximation of this data. The correlation coefficient (r) indicates how well the line approximates the data. If |r| = 1, the line is a perfect fit to the data; if |r| = 0, the line does not fit the data at all. In general, the closer |r| is to 1, the better the fit.
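For reference, and assuming the activity reports the standard Pearson correlation coefficient (the page does not say which formula it uses), r for n data points (x_i, y_i) with means x̄ and ȳ can be written as:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \; \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Under this definition r always lies between -1 and 1, and its sign matches the sign of the slope of the regression line. That is why the quality of the fit is judged by |r| rather than by r itself.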

• What is the correlation coefficient (r) for this set of data?
• Remove the data for Kobe Bryant. How does this change the regression equation and r value?
• Replace the data for Kobe Bryant, and remove the data for another player. Repeat this process for each player in the list. For which player does the removal of data have the greatest impact on the regression equation and r value? What does the change indicate?
• Can you explain the changes that occurred when data was removed?
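The remove-one-player investigation in the questions above can also be run outside the activity. The following Python sketch refits the line with each player left out in turn. It uses the data from the table and assumes a standard least-squares fit of minutes on points; the helper names `fit` and `leave_one_out` are hypothetical.

```python
import math

# (player, points scored, minutes played), copied from the table above.
players = [
    ("Kobe Bryant", 1819, 2689),
    ("Caron Butler", 1195, 2746),
    ("Chucky Atkins", 1115, 2903),
    ("Lamar Odom", 975, 2320),
    ("Chris Mihm", 735, 1870),
    ("Jumaine Jones", 577, 1830),
]

def fit(points):
    """Return (slope, intercept, r) of the least-squares line through points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / math.sqrt(sxx * syy)

def leave_one_out(data):
    """Refit the line once per player, each time omitting that player."""
    results = {}
    for name, _, _ in data:
        remaining = [(x, y) for other, x, y in data if other != name]
        results[name] = fit(remaining)
    return results

m, b, r = fit([(x, y) for _, x, y in players])
print(f"all players: y = {m:.3f}x + {b:.1f}, r = {r:.3f}")
for name, (m, b, r) in leave_one_out(players).items():
    print(f"without {name}: y = {m:.3f}x + {b:.1f}, r = {r:.3f}")
```

Comparing each printed line with the all-players line shows which single removal changes the slope and r the most. Explaining why is still the point of the questions.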

You can conduct similar investigations for other sports by looking at the statistics for Major League Baseball (MLB), National Football League (NFL), Women's National Basketball Association (WNBA), Major League Soccer (MLS), or other sports that interest you.