
Impact of a Superstar

  • Lesson
Measurement, Data Analysis and Probability
Samuel E. Zordak

In this activity, students will use the Illuminations Line of Best Fit Interactive to plot the data from two teams during the 2004‑05 NBA season. In particular, students will look at the data for total points and minutes played by each of the starters on the Los Angeles Lakers and Detroit Pistons. The data suggest that Laker Kobe Bryant is an outlier: he scores more points per minute than his teammates, which is part of why some sportswriters have described him as "selfish." Through further investigation, students will notice that Piston Ben Wallace is also an outlier, because he scores fewer points per minute than his teammates.

This lesson gives students an opportunity to identify an outlier within a set of real-life data.

The context for this investigation involves data for the Los Angeles Lakers and Detroit Pistons for the 2004‑05 NBA season. However, you may wish to choose other data sets for use in your classroom. For geographical reasons, you might want to use teams closer to your school. To match the interest of students, you may wish to use data regarding players from a different sport. Or, if your students are generally uninterested in sports, you may wish to use a completely different set of data.

Using the Illuminations Line of Best Fit Interactive, students plot sets of data. Within each of these sets of data, there is an outlier that students can detect in either of two ways:

  1. By visual inspection. A scatterplot of the data will show that some points are not aligned with the others.
  2. By correlation comparison. By removing one player's data at a time, students can determine which player's data has the greatest impact on the correlation coefficient.

Students can do a visual inspection using paper-and-pencil techniques, but repeatedly determining the correlation coefficient on a set of data when points are removed can be quite cumbersome. With the use of technology (e.g., Excel or graphing calculators), however, the situation can be explored quickly and the tedium of the calculations is removed. Consequently, it is the second method that will be covered more thoroughly in this lesson.
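For teachers who would like to see the second method outside of the interactive, the following is a minimal sketch in Python. It is not part of the original lesson materials. The function names `pearson_r` and `leave_one_out`, the player labels, and the (minutes, points) values are all illustrative assumptions; the numbers are made up and are not the statistics from the activity sheet.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def leave_one_out(data):
    """Map each label to the r-value computed with that label's point removed."""
    results = {}
    for name in data:
        rest = [point for other, point in data.items() if other != name]
        xs, ys = zip(*rest)
        results[name] = pearson_r(xs, ys)
    return results

# Hypothetical (minutes, points) pairs, NOT real NBA statistics.
players = {
    "A": (1000, 400),
    "B": (1500, 620),
    "C": (2000, 790),
    "D": (2500, 1010),
    "E": (2600, 1800),  # deliberately placed off the trend
}

xs, ys = zip(*players.values())
r_all = pearson_r(xs, ys)
r_without = leave_one_out(players)

# The player whose removal raises r the most is the outlier candidate.
most_influential = max(r_without, key=r_without.get)

print(f"all players: r = {r_all:.2f}")
for name, r in r_without.items():
    print(f"without {name}: r = {r:.2f}")
print("largest r after removing:", most_influential)
```

To try this with the lesson's data, replace the `players` dictionary with the values from the activity sheet.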

Line of Best Fit Interactive

Impact of a Superstar Activity Sheet

Team Data Spreadsheet (Excel)

Begin the lesson by distributing the Impact of a Superstar Activity Sheet. The first page of the activity sheet provides some background information about the teams. The data to be entered into the Line of Best Fit Interactive can be found on the second page of the activity sheet. (In addition, all of the data appears in the Team Data Spreadsheet (Excel). Two columns of data can be pasted from an Excel spreadsheet into the text box in the Line of Best Fit Interactive; when the Update Plot button is pressed, the points will appear in the scatterplot.)

Following the directions on the activity sheet, students will first work with the data for the Los Angeles Lakers. They will plot the data and then determine the line of best fit for this data. In particular, they will want to take note of the correlation coefficient (r‑value) for the regression line. Then, one at a time, students will remove one player's data from the set and determine what effect, if any, the removal of that player's data has on the line of best fit and correlation coefficient. [For the Lakers, students will notice that the correlation coefficient is 0.75 when the data for all players is considered. However, when the data for Kobe Bryant is removed, the r‑value increases to 0.95; when the data for any other player is removed, the correlation coefficient either stays the same or decreases. This indicates that the data for Kobe Bryant might be an outlier.]
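The effect of removing one player on the line of best fit itself can be illustrated with a short Python sketch. This assumes an ordinary least-squares fit, which may or may not match how the interactive computes its line. The function name `fit_line` and the data are hypothetical; the values are made up and are not taken from the activity sheet.

```python
def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical (minutes, points) pairs, NOT real NBA statistics.
# The last point is deliberately placed off the trend.
points = [(1000, 400), (1500, 620), (2000, 790), (2500, 1010), (2600, 1800)]

slope_all, intercept_all = fit_line(points)
slope_trimmed, intercept_trimmed = fit_line(points[:-1])  # last point removed

print(f"all points:    y = {slope_all:.3f}x + {intercept_all:.1f}")
print(f"point removed: y = {slope_trimmed:.3f}x + {intercept_trimmed:.1f}")
```

In this made-up data set, the single off-trend point pulls the line noticeably steeper. Students can compare that change in slope with the change in the r‑value they observe.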

Questions 1‑6 on the activity sheet take students step-by-step through the process for removing one player's data and considering the effect on the correlation coefficient. In Question 7, however, students are left to conduct a similar experiment on their own using data for the Detroit Pistons. [In this investigation, students will likely notice that Ben Wallace represents something of an outlier. When the data of any other player is removed, the r‑value does not change significantly; but when the data for Ben Wallace is removed, the correlation coefficient increases from r = 0.85 to r = 0.97.]

Assessment Options

  1. Collect student work on the Impact of a Superstar Activity Sheet.
  2. Allow students to analyze another set of data using the Illuminations Line of Best Fit Interactive. Students should be able to identify any outliers and explain how they know.


Allow students to combine the data for the Lakers and the Pistons and consider the complete set. Is either Kobe Bryant or Ben Wallace an outlier in this set? Are both of them still outliers? How do you know?

[The correlation coefficient is 0.77 when all players are included. When data for both players are removed, the correlation coefficient increases to 0.96. If only Kobe or only Ben is removed, it increases to 0.88 or 0.83, respectively. It could therefore be said, perhaps, that the data for Kobe is more of an outlier than the data for Ben.]


Questions for Students 

1. Does it appear that the data for any player from either team represents an outlier?

[It appears that the data for Kobe Bryant represents an outlier for the Lakers, and the data for Ben Wallace represents an outlier for the Pistons.]

2. Some sportswriters have accused Kobe Bryant of being a selfish basketball player; that is, they say he tries to score more than he tries to help his team. Do the results of this investigation seem to support that accusation? Given that Ben Wallace is also an outlier, could he be accused of being selfish, too?

[The data suggest that Kobe Bryant scores more points per minute than his teammates. However, it is difficult to determine whether that is a result of selfishness. Perhaps he is just a better basketball player. On the other hand, Ben Wallace scores fewer points per minute than his teammates, which would suggest that he is not selfish. The reason he scores fewer points is that he concentrates on rebounding and blocking shots more than scoring.]
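The points-per-minute comparison behind this answer is simple arithmetic, and students could check it directly. The sketch below is one possible way to do so in Python. The labels and season totals are hypothetical and are not the figures from the activity sheet.

```python
# Hypothetical season totals as (minutes, points), NOT real NBA statistics.
totals = {
    "A": (1000, 400),
    "B": (1500, 620),
    "C": (2000, 790),
    "D": (2500, 1010),
    "E": (2600, 1800),
}

# Scoring rate for each player: total points divided by total minutes.
rates = {name: points / minutes for name, (minutes, points) in totals.items()}

# List players from highest to lowest scoring rate.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {rate:.3f} points per minute")

highest = max(rates, key=rates.get)
lowest = min(rates, key=rates.get)
print("highest rate:", highest, "| lowest rate:", lowest)
```

A rate that stands well apart from the others, in either direction, corresponds to a point that sits far from the line of best fit.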

Teacher Reflection 

  • How did technology help students as they attempted to identify outliers? What things were possible with technology that would have taken longer (or perhaps been impossible) without technology?
  • Were students actively engaged in this lesson? If not, are there other data sets that could be used that would be more interesting to your students?

Learning Objectives

Students will:
  • Use technology tools to plot data, identify lines of best fit, and detect outliers.
  • Compare the lines of best fit when one element is removed from a data set, and interpret the results.

NCTM Standards and Expectations

  • Make decisions about units and scales that are appropriate for problem situations involving measurement.
  • Use unit analysis to check measurement computations.
  • Understand histograms, parallel box plots, and scatterplots and use them to display data.
  • For bivariate measurement data, be able to display a scatterplot, describe its shape, and determine regression coefficients, regression equations, and correlation coefficients using technological tools.
  • Identify trends in bivariate data and find functions that model the data or transform the data so that they can be modeled.

Common Core State Standards – Practice

  • CCSS.Math.Practice.MP5
    Use appropriate tools strategically.
  • CCSS.Math.Practice.MP7
    Look for and make use of structure.