## Exploring Linear Data

• Lesson
6-8,9-12
3

Students model linear data in a variety of settings that range from car repair costs to sports to medicine. Students work to construct scatterplots, interpret data points and trends, and investigate the notion of line of best fit.

Introduction

The first step in exploring linear data is understanding the data. For some relations there is clearly an independent, or operating, variable and a dependent, or response, variable (for example, time and distance). The choice of variables when fitting lines does not always depend on the physical relation between the operating and response variables. It is often more important to know which of the two is to be predicted. If, for example, students investigate the relationship between the temperature and the number of times a cricket chirps in a given period of time, they will find a linear relationship. Clearly, the temperature determines to a large extent the rapidity with which crickets chirp, not the other way around. If the students want to predict the temperature by counting chirps, however, the better prediction would come from using the number of chirps as the independent variable and the temperature as the dependent variable.

Oil Changes and Engine Repairs

The table below displays data that relate the number of oil changes per year to the cost of engine repairs. The activity that follows uses these data to introduce students to modeling with a linear function. To predict the cost of repairs from the number of oil changes, use the number of oil changes as the x variable and engine-repair cost as the y variable.

When graphing the data, it is important to ask students how the axes should be labeled. Guide the class discussion toward labeling the x-axis "Number of Oil Changes per Year" (with a scale of 0–10) and the y-axis "Engine Repair Cost (in dollars)" (with a scale of 0–700).

Oil Changes and Engine Repair

Use the Line of Best Fit Tool to graph the data for the class. Project the graph onto the overhead projector or computer/television monitor. (Alternatively, students may graph the data using grid paper and pencil.)

The figure below displays the data from the above table graphically. The students are asked to visualize a straight line as a representation of the data. Each student should draw a line that seems to "fit" the plotted points. A line that "fits" the points should have the same characteristics as the set of points; it should actually summarize the data. A line drawn this way is called an eyeball-fit line.

Oil Changes vs. Engine Repair
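For comparison with an eyeball-fit line, the sketch below computes a least-squares line, one common definition of a "line of best fit". It is a minimal illustration only: the function name `least_squares` is made up for this example, the five data points are hypothetical (they are not the lesson's table), and it is an assumption that the Line of Best Fit Tool uses this same method.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length lists")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared horizontal deviations, and sum of cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are equal: slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# HYPOTHETICAL data, for illustration only (not the lesson's table):
# oil changes per year and engine-repair cost in dollars.
oil_changes = [0, 2, 4, 6, 8]
repair_cost = [640, 520, 380, 210, 90]

m, b = least_squares(oil_changes, repair_cost)
print(m, b)  # -70.5 650.0
```

Students can compare a computed slope like this one with the slopes of their own eyeball-fit lines.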

Once the graph has been drawn, ask students, "What is the significance of the line's downward (or negative) slope?" Students will likely say it means that the more oil changes made per year, the less money needs to be spent on engine repairs. Note that a more accurate wording is, "There appears to be a relation between the number of oil changes and the money spent on repairs." Although the line represents the data, there is no indication that changing the oil more often causes lower engine-repair costs. Many other variables affect the cost of engine repairs. More oil changes may just mean those cars have more careful drivers.

Students should examine the slope, which can be found by counting units. First they must define what a unit represents for each variable. In the graph above, one horizontal unit represents one oil change, and one vertical unit represents $100 in engine repairs. Since students have drawn different lines, the slopes will vary. Note that whereas a fraction or ratio is easily understood as the slope or rate of change, a decimal representation is more useful for comparing slopes. After listing all the slopes reported by the class (perhaps in a stem-and-leaf plot), the class should determine a "consensus" slope. The number should be simple to use. For example, -70 is much more useful than -73.07 (the result obtained using the Illuminations Line of Best Fit Tool) or -68.42 (the result obtained by choosing the two points (0, 650) and (9.5, 0) from the yellow line on the graph above). Since utility is more highly valued than precision in this instance, a slope of -70 seems to be a reasonable value.

Students should be able to interpret slope as rate of change. Ask students, "What is the change in the cost of repairs for each oil change?" A rate of change (or slope) of -70 indicates that for each additional oil change per year, the cost of engine repairs will tend to decrease by $70. (Students should note that a slope of -70 actually represents -70/1, which indicates a decrease of $70 per 1 oil change.)

Associating measurement units with the slope is important because it gives students a concrete basis for understanding. Students should recognize that changing the units on an axis will affect the slope. If the vertical axis were in cents rather than dollars, the slope would be -7,000.

Next, examine the intercepts. The y-intercept is about 650. This means that if there are zero oil changes, engine repairs will cost about $650. From the graph, the x-intercept is about 9.5, which means that a car owner would expect to spend nothing on engine repairs if she changed the oil 9.5 times a year. Is this a sensible number of oil changes per year?
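The slope arithmetic described above can be checked with a short sketch. The function name `slope` is chosen here for illustration only. The two points are the ones the lesson reads from the line on the graph, and the unit change uses the consensus slope of -70.

```python
def slope(p1, p2):
    """Rise over run between two points (x, y) on a line."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("vertical line: slope is undefined")
    return (y2 - y1) / (x2 - x1)

# Two points read from the line in the lesson: (0, 650) and (9.5, 0).
m = slope((0, 650), (9.5, 0))
print(round(m, 2))  # -68.42

# Changing the vertical axis from dollars to cents multiplies
# every vertical change, and therefore the slope, by 100.
consensus_slope_dollars = -70
consensus_slope_cents = consensus_slope_dollars * 100
print(consensus_slope_cents)  # -7000
```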

The slope and the y-intercept can be used to write the equation of the line:

Writing y = mx + b for m = ‑70 and b = 650, we get y = ‑70x + 650.

If the y‑intercept is not accessible because of the scale, the equation of the line can be found by using any two points (not necessarily data points) on the line. Students can then use the equation to predict the cost of engine repairs expected for a specific number of oil changes.

For example, if you change your oil four times a year, how much can you expect to pay in engine repairs?

Let x = 4; then y = -70(4) + 650 = 370. With four oil changes per year, you can expect to pay about $370 in engine repairs.
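As a rough sketch of both approaches, the code below evaluates the consensus line at x = 4 and also recovers the same line from two points on it, as one would when the y-intercept is off the scale. The function names `line_through` and `predict` are illustrative, and the two points (2, 510) and (5, 300) were picked for this example because they lie on y = -70x + 650; they are not data points from the table.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("vertical line cannot be written as y = m*x + b")
    m = (y2 - y1) / (x2 - x1)
    b = y1 - m * x1  # solve y1 = m*x1 + b for b
    return m, b

def predict(m, b, x):
    """Predicted y value on the line y = m*x + b."""
    return m * x + b

# Consensus line from the lesson: y = -70x + 650.
print(predict(-70, 650, 4))  # 370

# Same line recovered from two illustrative points on it.
m, b = line_through((2, 510), (5, 300))
print(m, b)  # -70.0 650.0

# x-intercept of the consensus line: solve 0 = m*x + b.
print(round(-b / m, 2))  # 9.29
```

The x-intercept of the rounded consensus line (about 9.29) differs slightly from the 9.5 read from the graph because the slope was rounded to -70.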

Additional points on the graph can be discussed as needed.

Bike Weights and Jump Heights

Distribute the Bike Weights and Jump Heights activity sheet to the students.

 Bike Weights and Jump Heights Activity Sheet

Students may work individually or in pairs to complete the activity sheet. Students may plot the graph by hand, on the grid paper provided, or they may use the Line of Best Fit Tool for their graphs.

Answers to the activity sheet are provided below.

1. Check student graphs. Alternatively, you may display the Bike Weights and Jump Heights Overhead, which shows another graph of the same data.
Bike Weight vs. Jump Height

2. Negative
3. Decreases
4. -0.150; for every 1-pound increase in bike weight, the jump height decreases by about 0.15 inch (slightly less than 2/10 of an inch).
5. A 21.5‑pound bike would be able to jump about 10 inches.

Weights and Drug Doses

Distribute the Weights and Drug Doses activity sheet. Once again, students may work individually or in groups.

 Weights and Drug Doses Activity Sheet

Answers to the activity sheet are provided below.

This problem about prescription medicine illustrates the importance of slope and reinforces the notion of rate of change. You may instruct students to draw median-fit lines, eyeball lines, or regression lines depending on the background of the class.

1. Check student graphs.


Weights vs. Drug Doses

Alternatively, you may display the Weights and Drug Doses Overhead, which shows another graph of the same data.

2. The slope for the usual dosage is about 0.45, and the slope for the maximum dosage is about 0.76. For every 1-pound increase in weight, the usual dosage increases by about 0.45 (in the table's dosage units), compared with an increase of about 0.76 per pound for the maximum dosage.
3. The (weight, usual dosage) equation is approximately y = 0.46x. The (weight, maximum dosage) equation is approximately y = 0.76x - 0.05.
4. The lines are not parallel because they have different slopes.
5. y = 1.67x - 0.46. The slope of the new line is the ratio of the two slopes: the change in maximum dosage per pound divided by the change in usual dosage per pound (0.76 / 0.46). (Weight cancels out of the ratio.)
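The "ratio of the slopes" idea can be sketched by eliminating weight from the two approximate equations given in the answers (y = 0.46x for usual dosage and y = 0.76x - 0.05 for maximum dosage). The function name `eliminate_weight` is made up for this illustration. Because the inputs are rounded, the result need not match the answer key's coefficients exactly; a line fitted directly to (usual dosage, maximum dosage) pairs could differ.

```python
def eliminate_weight(usual, maximum):
    """Express maximum dosage as a linear function of usual dosage.

    usual   = (a1, b1) means usual   = a1 * weight + b1
    maximum = (a2, b2) means maximum = a2 * weight + b2
    Returns (slope, intercept) with maximum = slope * usual + intercept.
    """
    a1, b1 = usual
    a2, b2 = maximum
    if a1 == 0:
        raise ValueError("usual dosage does not depend on weight")
    slope = a2 / a1                # ratio of the two slopes
    intercept = b2 - slope * b1    # from substituting weight = (usual - b1) / a1
    return slope, intercept

# Approximate equations taken from the answers above.
slope, intercept = eliminate_weight((0.46, 0.0), (0.76, -0.05))
print(round(slope, 2), round(intercept, 2))  # 1.65 -0.05
```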

Assessments

Assess students' thinking by their responses to the questions on the activity sheets.

Extensions

1. In the Oil Changes and Engine Repair Activity, ask students, "What is the difference between data points above the line and those below the line?" Points above the line would indicate that the actual engine repairs exceeded the amount predicted by the number of oil changes. Points below the line represent situations where the engine repairs cost less than predicted. Because the line is only a summary of the relation, just as the mean or median is a summary for a single set of data, there is a degree of variation in using the line to predict.
2. Additional discussion could explore possible reasons, in addition to natural variation, for this deviation from the line. Excessive engine repairs could be due to bad driving habits, lower-than-usual temperatures, and so on.
3. Distribute the Winning Times activity sheet. Students can interpolate by predicting what might have occurred during the war years as well as look at the danger in extrapolating. The 1988 time was 4:13. The slope is meaningful but might not remain consistent as the curve levels off. The domain and range do not include the intercepts.
 Winning Times Activity Sheet
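For the first extension, the difference between points above and below the line can be made concrete by computing each point's vertical distance from the line (its residual). The sketch below uses the consensus line y = -70x + 650 from the lesson. The function names and the three sample points are hypothetical and chosen only for illustration; they are not from the lesson's table.

```python
def residual(point, m=-70, b=650):
    """Actual y minus the y predicted by the line y = m*x + b."""
    x, y = point
    return y - (m * x + b)

def position(point, m=-70, b=650):
    """Say whether a point lies above, below, or on the line."""
    r = residual(point, m, b)
    if r > 0:
        return "above"  # repairs cost more than the line predicts
    if r < 0:
        return "below"  # repairs cost less than the line predicts
    return "on"

# HYPOTHETICAL (oil changes, repair cost) points, for illustration only.
sample_points = [(2, 560), (4, 330), (6, 230)]
for p in sample_points:
    print(p, residual(p), position(p))
# (2, 560) 50 above
# (4, 330) -40 below
# (6, 230) 0 on
```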

### Learning Objectives

Students will be able to:

• Construct scatterplots of two-variable data
• Interpret individual data points and make conclusions about trends in data, especially linear relationships
• Estimate and write equations of lines of best fit

### Common Core State Standards – Mathematics

• CCSS.Math.Content.8.SP.A.1
Construct and interpret scatter plots for bivariate measurement data to investigate patterns of association between two quantities. Describe patterns such as clustering, outliers, positive or negative association, linear association, and nonlinear association.

• CCSS.Math.Content.8.SP.A.2
Know that straight lines are widely used to model relationships between two quantitative variables. For scatter plots that suggest a linear association, informally fit a straight line, and informally assess the model fit by judging the closeness of the data points to the line.