### A Generalization of the Birthday Problem

#### by David Radcliffe

In a group of *N* people, chosen at random, what is the probability that two or more share the same birthday? We assume that birthdays are uniformly distributed among the 365 days of the year, ignoring leap days. This question is known as the Birthday Problem.

The probability is higher than one might expect. In a group of 23 people, there is a 51% chance that two or more people have the same birthday. If there are 40 people, the probability increases to 89%.
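As a quick check of these figures, here is a minimal Python sketch under the same assumption of 365 equally likely birthdays. It multiplies out the probability that all birthdays are distinct and subtracts the result from 1. The function name `shared_birthday_probability` is only an illustrative choice.

```python
def shared_birthday_probability(people, days=365):
    """Probability that at least two of `people` share a birthday,
    assuming every day is equally likely (leap days ignored)."""
    p_all_distinct = 1.0
    for i in range(people):
        # Person i + 1 must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (23, 40):
        print(n, round(shared_birthday_probability(n), 2))
```

Rounded to two decimal places, the values for 23 and 40 people should match the 51% and 89% quoted above.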

There are many sites that explain the Birthday Problem, so I won’t discuss the details. An excellent non-technical explanation was given by Steven Strogatz in the article “It’s My Birthday Too, Yeah,” which was published in the New York Times. More advanced treatments can be found on Wikipedia and MathWorld.

Instead, I would like to challenge the reader to solve a more difficult problem. *In a group of 500 people, what is the probability that six or more have the same birthday?*

My answer is 65.4%, which I computed using the attached Python code. However, I am not absolutely certain that this answer is correct, because the calculations are subject to rounding error, so I am hoping that someone will verify this result independently.

**Python code:** birthday.py
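One way to sidestep the rounding concern is exact integer arithmetic. The sketch below is an independent illustration, not the contents of birthday.py. It counts the ways to assign labeled people to days so that no day holds more than five people, then divides by the total number of assignments. The function name `prob_some_day_has_at_least` and its parameters are assumptions made for this example.

```python
from fractions import Fraction
from math import comb


def prob_some_day_has_at_least(people, threshold, days=365):
    """Exact probability that some day is the birthday of at least
    `threshold` of the `people`, assuming equally likely days."""
    cap = threshold - 1  # largest group allowed on any single day
    # ways[n] = number of ways to place n labeled people on the days
    # processed so far with no day holding more than `cap` people.
    ways = [1] + [0] * people
    for _ in range(days):
        new = [0] * (people + 1)
        for n in range(people + 1):
            total = 0
            # Choose which k of the n people fall on the newly added day.
            for k in range(min(cap, n) + 1):
                total += comb(n, k) * ways[n - k]
            new[n] = total
        ways = new
    # Complement: 1 - P(every day has at most `cap` people).
    return 1 - Fraction(ways[people], days ** people)


if __name__ == "__main__":
    p = prob_some_day_has_at_least(500, 6)
    print(float(p))
```

Because Python integers have arbitrary precision, the only rounding happens in the final conversion to a float. The full 500-person computation performs roughly a million big-integer operations, so it may take a few seconds.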

Here’s a one-liner in R that draws a million samples of size 500 from {1, 2, …, 365} (with replacement, of course), finds the maximum number of people sharing a birthday in each sample, and tabulates the results:

table(replicate(10^6, max(table(sample(365, 500, replace=TRUE)))))

The table I get is as follows. For example, there was one sample in which 13 of 500 people had the same birthday.

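| Largest group sharing a birthday | Number of samples |
|---|---|
| 4 | 5773 |
| 5 | 339786 |
| 6 | 475991 |
| 7 | 146220 |
| 8 | 27489 |
| 9 | 4129 |
| 10 | 541 |
| 11 | 60 |
| 12 | 10 |
| 13 | 1 |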

So I observe 654,441 samples (about 65.4%) in which six or more people share a birthday, which is in line with your result.
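For readers who prefer Python, the same simulation can be sketched with the standard library alone. This is a rough equivalent of the R one-liner, not a translation verified against it. The function name `simulate_max_shared` and the smaller default trial count are assumptions chosen to keep pure-Python run time reasonable.

```python
import random
from collections import Counter


def simulate_max_shared(trials, people=500, days=365, seed=None):
    """Tally, over `trials` random groups, the size of the largest
    set of people sharing a birthday."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(trials):
        # Sample birthdays with replacement, as in the R code.
        birthdays = rng.choices(range(days), k=people)
        largest_group = max(Counter(birthdays).values())
        tally[largest_group] += 1
    return tally


if __name__ == "__main__":
    trials = 10_000  # far fewer than 10^6, so estimates are noisier
    tally = simulate_max_shared(trials, seed=1)
    for size in sorted(tally):
        print(f"{size}: {tally[size]}")
    six_or_more = sum(count for size, count in tally.items() if size >= 6)
    print("fraction with six or more:", six_or_more / trials)
```

With only ten thousand trials the estimated fraction can be off by about half a percentage point, so it should land near, but not exactly on, the figure above.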

Thanks Michael!
