Pin it!
Google Plus

Birthday Paradox

  • Lesson
6-8,9-12
1
Data Analysis and Probability
Unknown
Location: Unknown

This activity demonstrates the Birthday Paradox, using it as a springboard into a unit on probability. Students use a graphing calculator to run a Monte Carlo simulation with the birthday paradox and perform a graphical analysis of the birthday-problem function. This lesson was adapted from an article, written by Matthew Whitney, which appeared in the April 2001 edition of Mathematics Teacher.

Background Information 

A graphing-calculator feature that can be used in conjunction with the classic birthday paradox is a random-integer generator. With the proper commands, this feature allows the calculator to create lists of random birthdays that can serve as data for students to explore. Students can then run calculator-based simulations of the birthday paradox and can explore the fact that more than 50 percent of the time, when groups of "random" strangers are assembled, only twenty-three persons are needed to find a matching pair of birthdays.

The prime objective in this activity is to demonstrate the birthday paradox, using it as a springboard into a unit on probability. By demonstrating a counter intuitive probabilistic event, the goal is to attract the students' interest while tearing down some of their preconceived notions.

The difficulty with probability, though, is that unlikely events still do have a chance of occurring. For example, in the birthday paradox, a group of twenty-three randomly selected persons must be selected to have a greater than 50 percent chance that any two of them share the same birthday (Lesser, 1999). Because some classes may have just a few students less than that number, the chance of finding birthday matches would be good but still less than 50 percent in each class. The chance would be slightly better that none of the students in a particular class would share a birthday. One might expect that when the teacher presents the topic and asks students to predict the size of the group needed to get a 50 percent or better chance of a match, the responses would include 183, so that more than half the days of the year would be covered. If we fail to demonstrate a match with considerably fewer people, students would likely be unimpressed and their faith in the "obvious" answer would be strengthened. Therefore, the goal is to create a situation in which the sample would illustrate the paradox.

By employing a graphing calculator, the activity has the added benefit of introducing the students to the real-world mathematics application of Monte Carlo-type simulation techniques. Thus, by exploring an interesting topic, the students would also see a snapshot indicating how classroom mathematics connects with the real world.

The graphing calculator's ability to generate random integers is essential to this activity. It can be used to assemble "random" strangers so that we could test the reliability of the assertion in the paradox. At this point, the activity becomes a Monte Carlo simulation. Monte Carlo simulations often involve using computer-generated random processes to model various phenomena. They have been used to study phenomena that range from the weather to the patterns of cracks caused by stress and fatigue in metal. Generating random values to use as input for certain variables is vital to these simulations. The activity offers students a quick method for generating random dates that they then sort and search for matching pairs. A great deal of data can be analyzed in a short amount of time, and students can then see that the data do point toward the assertion in the birthday paradox.

Using the TI-83 to run a Monte Carlo Simulation with the Birthday Paradox 

Note: The activities shown below are based upon the TI-83. Other graphing calculators may have different functions. 

pdficonBirthday Paradox TI-83/TI-84 Plus Instructions 

The goal is to generate sets of random integers into the lists in the calculator. Although the store feature can be used, it may be preferable to enter lists by going directly to the  STAT  menu and selecting option 1: Edit, then:

  1. position the cursor on the heading for list(L1)
  2. select the MATH menu
  3. use the arrow to reach the Probability menu (PRB)
  4. select option 5: randlnt(
  5. after the prompt randlnt(, enter the following numbers: 1, 12, 23
  6. close the parentheses and press ENTER.

This procedure generates a list of twenty-three random integers and uses only the integers from 1 through 12, which serve as the birth months. At this stage, students do not know why the number 23 is special to the problem. The instructor gives the students that number and tells them that the goal of the activity is to determine whether twenty-three random birthdays are enough to produce at least one matching pair in 50 percent or more of the trials. The next step is placing the cursor on the heading for list 2, L2, and entering randlnt(1,31,23) to create a randomized list of days of the months. The students can then read lists 1 and 2 as twenty-three random "birthdays" by month and day. See the figure below.

1198 birthday1    

After lists of random dates have been generated, the next task is to determine whether the list contains a match. Again, the calculator rises to the challenge with the SortA() feature of the Stat function. With months stored in L1 and days in L2, students store 100*L1 + L2 into L3 and then execute the command SortA(L3,L1 ,L2). This process first produces a three- or four-digit number that represents each date and rearranges in ascending order lists L1 and L2 correspondingly. Matching pairs will be evident at this point. See the figures below.


1198 birthday 2a1198 birthday 2b 

1198 birthday 2c 

Since the goal is finding a match--a successful outcome--or no matches, this sorting is an ideal feature. Students should be able to produce several of these simulations; in a classroom of students who each produced several simulations, a large number of simulations will be obtained.

After each student has run the simulation several times, the class collects the number of successful trials and divides this number by the total number of trials to find an average. It should be about 50 percent, thereby supporting the prediction that twenty-three persons are sufficient to get a birthday match 50 percent of the time. Students should be suitably impressed and receptive to learning why only a small group is needed.

The teacher should emphasize that the lengths of the lists can be changed. After the students explore whether twenty-three birthdays are sufficient to get a match 50 percent or more of the time, they can explore the percent of times that matches occur with groups of different sizes by running additional trials with a new number. One recommendation is to use the number of students in the class.

Special attention should be paid to "bogus" data, that is, dates that do not exist on the calendar. One solution that avoids such dates is creating one list by randlnt(1,365,n), where n is the length of the selected list, and reading the numbers as the day in the year, for example, 35 corresponds to 4 February. Students may prefer reading the dates directly as month and day, so any batch of twenty-three dates that includes bogus dates should be omitted and a new pair of lists generated.

Data collection can be streamlined even further with program 1. In the program, N represents the size of the group that is being tested for a match. After the program has run, students should examine the lists of dates that have been stored in the calculator's lists, check for bogus dates, and determine whether a match occurred for that set.

An additional virtue of this exploration is that students are free to predict how many "birthdays" are needed to generate matches and are then responsible for the experiment that collects the data. They are in control. The graphing calculator rapidly creates the trials and allows students a great amount of freedom to explore the situation and obtain rapid feedback.

Graphical Analysis of the Birthday-Problem Function 

When students are intrigued with the birthday paradox and ready to move into other probability topics, the teacher should formally present the paradox, without simulations. Students can use a calculator or computer spreadsheet for number crunching and then look at graphical representations of the probabilities.

The teacher should first return to the beginning of the problem and instruct students to calculate the first few probabilities. The probability that two strangers do not share a birthday is 364/365, assuming that neither of them was born in a leap year, with the probability of a match being the complement of this event, that is, 1 - \frac{{364}}{{365}}, or 0.00274. The probability that three strangers do not share a birthday is \frac{{364}}{{365}} \times \frac{{363}}{{365}}, or 0.991796, with the complement, where a pair of matching birthdays exists, being 

1 - \frac{{364}}{{365}} \times \frac{{363}}{{365}} = 0.008204 

Considering a group of four strangers where a matching pair is present produces the probability

1 - \frac{{364}}{{365}} \times \frac{{363}}{{365}} \times \frac{{362}}{{365}} = 0.016356 

At this stage, the students should generate a general formula that calculates the probability of a matching pair of birthdays for r random birthdays. The general formula involves permutations and is

{P_{match}}(r) = 1 - \frac{{_{365}{P_r}}}{{{{365}^r}}}

where r is the number of random birthdays. This result can be calculated with handheld calculators for the first few values of r, but larger values of r may be beyond the capabilities of most calculators. The table feature of the TI-83, for example, handles the calculation through r = 39, but at r = 40 the calculator fails because of overflow problems. At this point, students could turn to the computer and generate a spreadsheet, which can handle the large computations. The spreadsheet can then be used to calculate the probabilities and create a scatter-plot of the results (Lesser 1999).

One more application of this problem is generating a table and graph using the graphing calculator in sequential mode. Again employing the TI-83 to find the probabilities of getting a match for n people in a group, students change to sequential mode and enter nMin = 1 and the formula

u(n) = 1 - (1 - u(n - 1))(\frac{{366 - n}}{{365}}) 

and

u(nMin) = \{ 0\}  

then view probabilities in the table. Using the function in a more conventional manner as u(n) = 1 - (365 nPr n)/365^n leads to an overflow in the TI-83 after n = 40, thus showing the need for reworking the function in sequential mode. See the figures below.

 1198 figure2a1198 figure2b 

 

1198 birthday 3c 

Set the window, and graph the function as shown in the figures below.

1198 figure4a1198 figure4b

1198 birthday 4c 

Assessment 

  1. Have students identify different situations in which the sample would illustrate the paradox.
  2. Have students choose a situation and apply the Monte Carlo simulation to too illustrate the paradox. 

Extensions 

1.   

TI-83 Program 1 

PROGRAM
Name = BIRTHDAY:CirAllLists
:Disp "NUMBER OF"
:Disp "PEOPLE?"
:Input N
:randint (1, 12, N) arrow right L1
:randint (1, 31, N) arrow right L2
:100[sup *]L1 + L2 arrow right L3
:SortA(L3, L2, L1)
 2. Have students collect data on a sample size double the original size and compare the results. 

 

Question for Students 

How would the results change if we doubled the amount of the sample size?

[The chances of there being a match would increase.]

Teacher Reflection 

  • Was the level of students enthusiasm towards probability higher or lower after the lesson?  Explain why.
  • What, if any, issues arose with classroom management? How did you correct them? If you use this lesson in the future, what could you do to prevent these problems?
  • Were students able to handle the mathematical content contained in the lesson? If not, what could be done to make it more understandable?
  • How were you able to challenge the high achievers? How did you extend the investigation to keep them interested?

Learning Objectives

Students will:

  • Apply basic concepts of probability.
  • Use simulations to generate and explore data.
  • Develop an understanding of permutations as a counting technique.
  • Generalize patterns using explicitly defined and recursively defined functions.
  • Use varied representations to model and interpret mathematical phenomena. 

Common Core State Standards – Mathematics

Grade 7, Stats & Probability

  • CCSS.Math.Content.7.SP.C.5
    Understand that the probability of a chance event is a number between 0 and 1 that expresses the likelihood of the event occurring. Larger numbers indicate greater likelihood. A probability near 0 indicates an unlikely event, a probability around 1/2 indicates an event that is neither unlikely nor likely, and a probability near 1 indicates a likely event.

Grade 6, Stats & Probability

  • CCSS.Math.Content.6.SP.A.1
    Recognize a statistical question as one that anticipates variability in the data related to the question and accounts for it in the answers. For example, ''How old am I?'' is not a statistical question, but ''How old are the students in my school?'' is a statistical question because one anticipates variability in students' ages.

Grade 6, Stats & Probability

  • CCSS.Math.Content.6.SP.A.2
    Understand that a set of data collected to answer a statistical question has a distribution which can be described by its center, spread, and overall shape.

Common Core State Standards – Practice

  • CCSS.Math.Practice.MP4
    Model with mathematics.
  • CCSS.Math.Practice.MP5
    Use appropriate tools strategically.