Background Information
A graphing-calculator feature that can be used in conjunction
with the classic birthday paradox is a random-integer generator. With
the proper commands, this feature allows the calculator to create lists
of random birthdays that can serve as data for students to explore.
Students can then run calculator-based simulations of the birthday
paradox and can explore the fact that more than 50 percent of the time,
when groups of "random" strangers are assembled, only twenty-three
persons are needed to find a matching pair of birthdays.
The prime objective in this activity is to demonstrate the
birthday paradox, using it as a springboard into a unit on probability.
By demonstrating a counter intuitive probabilistic event, the goal is to
attract the students' interest while tearing down some of their
preconceived notions.
The difficulty with probability, though, is that unlikely
events still do have a chance of occurring. For example, in the
birthday paradox, a group of twenty-three randomly selected persons
must be selected to have a greater than 50 percent chance that any two
of them share the same birthday (Lesser, 1999). Because some classes
may have just a few students less than that number, the chance of
finding birthday matches would be good but still less than 50 percent
in each class. The chance would be slightly better that none of the
students in a particular class would share a birthday. One might expect
that when the teacher presents the topic and asks students to predict
the size of the group needed to get a 50 percent or better chance of a
match, the responses would include 183, so that more than half the days
of the year would be covered. If we fail to demonstrate a match with
considerably fewer people, students would likely be unimpressed and
their faith in the "obvious" answer would be strengthened. Therefore,
the goal is to create a situation in which the sample would illustrate
the paradox.
By employing a graphing calculator, the activity has the added
benefit of introducing the students to the real-world mathematics
application of Monte Carlo-type simulation techniques. Thus, by
exploring an interesting topic, the students would also see a snapshot
indicating how classroom mathematics connects with the real world.
The graphing calculator's ability to generate random integers
is essential to this activity. It can be used to assemble "random"
strangers so that we could test the reliability of the assertion in the
paradox. At this point, the activity becomes a Monte Carlo simulation.
Monte Carlo simulations often involve using computer-generated random
processes to model various phenomena. They have been used to study
phenomena that range from the weather to the patterns of cracks caused
by stress and fatigue in metal. Generating random values to use as
input for certain variables is vital to these simulations. The activity
offers students a quick method for generating random dates that they
then sort and search for matching pairs. A great deal of data can be
analyzed in a short amount of time, and students can then see that the
data do point toward the assertion in the birthday paradox.
Using the TI-83 to run a Monte Carlo Simulation with the Birthday Paradox
Note: The activities shown below are based upon the TI-83. Other graphing calculators may have different functions.
There is an overhead of the instructions for students here.
The goal is to generate sets of random integers into the lists
in the calculator. Although the store feature can be used, it may be
preferable to enter lists by going directly to the STAT menu and selecting option 1:
Edit, then:
- position the cursor on the heading for list(L1)
- select the MATH menu
- use the arrow to reach the Probability menu (PRB)
- select option 5: randlnt(
- after the prompt randlnt(, enter the following numbers: 1, 12, 23
- close the parentheses and press ENTER.
This procedure generates a list of twenty-three random integers and
uses only the integers from 1 through 12, which serve as the birth
months. At this stage, students do not know why the number 23 is
special to the problem. The instructor gives the students that number
and tells them that the goal of the activity is to determine whether
twenty-three random birthdays are enough to produce at least one
matching pair in 50 percent or more of the trials. The next step is
placing the cursor on the heading for list 2, L2, and entering randlnt(1,31,23)
to create a randomized list of days of the months. The students can
then read lists 1 and 2 as twenty-three random "birthdays" by month and
day. See the figure below.
After lists of random dates have been generated, the next task
is to determine whether the list contains a match. Again, the
calculator rises to the challenge with the SortA() feature of the Stat function. With months stored in L1 and days in L2, students store 100*L1 + L2 into L3 and then execute the command SortA(L3,L1 ,L2). This process first produces a three- or four-digit number that represents each date and rearranges in ascending order lists L1 and L2 correspondingly. Matching pairs will be evident at this point. See the figures below.
Since the goal is finding a match--a successful outcome--or no
matches, this sorting is an ideal feature. Students should be able to
produce several of these simulations; in a classroom of students who
each produced several simulations, a large number of simulations will
be obtained.
After each student has run the simulation several times, the
class collects the number of successful trials and divides this number
by the total number of trials to find an average. It should be about 50
percent, thereby supporting the prediction that twenty-three persons
are sufficient to get a birthday match 50 percent of the time. Students
should be suitably impressed and receptive to learning why only a small
group is needed.
The teacher should emphasize that the lengths of the lists can
be changed. After the students explore whether twenty-three birthdays
are sufficient to get a match 50 percent or more of the time, they can
explore the percent of times that matches occur with groups of
different sizes by running additional trials with a new number. One
recommendation is to use the number of students in the class.
Special attention should be paid to "bogus" data, that is,
dates that do not exist on the calendar. One solution that avoids such
dates is creating one list by randlnt(1,365,n), where n
is the length of the selected list, and reading the numbers as the day
in the year, for example, 35 corresponds to 4 February. Students may
prefer reading the dates directly as month and day, so any batch of
twenty-three dates that includes bogus dates should be omitted and a
new pair of lists generated.
Data collection can be streamlined even further with program 1. In the program, N
represents the size of the group that is being tested for a match.
After the program has run, students should examine the lists of dates
that have been stored in the calculator's lists, check for bogus dates,
and determine whether a match occurred for that set.
An additional virtue of this exploration is that students are
free to predict how many "birthdays" are needed to generate matches and
are then responsible for the experiment that collects the data. They
are in control. The graphing calculator rapidly creates the trials and
allows students a great amount of freedom to explore the situation and
obtain rapid feedback.
Graphical Analysis of the Birthday-Problem Function
When students are intrigued with the birthday paradox and ready
to move into other probability topics, the teacher should formally
present the paradox, without simulations. Students can use a calculator
or computer spreadsheet for number crunching and then look at graphical
representations of the probabilities.
The teacher should first return to the beginning of the
problem and instruct students to calculate the first few probabilities.
The probability that two strangers do not share a birthday is 364/365,
assuming that neither of them was born in a leap year, with the
probability of a match being the complement of this event, that is, 1
-(364/365), or .00274. The probability that three strangers do not
share a birthday is (364/365) × (363/365), or .991796, with the
complement, where a pair of matching birthdays exists, being
1 - (364/365)(363/365) = 0.008204.
Considering a group of four strangers where a matching pair is present produces the probability
1 - (364/365)(363/365)(362/365) = 0.016356
At this stage, the students should generate a general formula that
calculates the probability of a matching pair of birthdays for r random birthdays. The general formula involves permutations and is
P_{match}(r) = 1 - (_{365}P_{r}/365^{r}),
where r is the number of random birthdays. This result can be calculated with handheld calculators for the first few values of r, but larger values of r
may be beyond the capabilities of most calculators. The table feature
of the TI-83, for example, handles the calculation through r = 39, but at r = 40
the calculator fails because of overflow problems. At this point,
students could turn to the computer and generate a spreadsheet, which
can handle the large computations. The spreadsheet can then be used to
calculate the probabilities and create a scatter-plot of the results
(Lesser 1999).
One more application of this problem is generating a table and
graph using the graphing calculator in sequential mode. Again employing
the TI-83 to find the probabilities of getting a match for n people in
a group, students change to sequential mode and enter nMin = 1 and the formula
u(n) = 1 - (1 - u(n - 1))*(366 - n)/365
and
u(nMin) = {0},
then view probabilities in the table. Using the function in a more conventional manner as u(n) = 1 - (365 nPr n)/365^n leads to an overflow in the TI-83 after n = 40, thus showing the need for reworking the function in sequential mode. See the figures below.
Set the window, and graph the function as shown in the figures below.
Summary
The virtues of this activity include using the graphing
calculator for some entertaining and engaging mathematics, which would
have been prohibitively dull and tedious without the technology.
Students have their assumptions challenged and are then free to explore
the problem and ask new questions by creating their own data. Because
this activity has connections with the work that professional
mathematicians do, it could serve as a springboard for Monte Carlo
simulations in other fields of research.