What is the birthday problem? The birthday problem concerns the probability that, in a set of n random people, some pair of them will have the same birthday. Thanks to something called the pigeonhole principle, probability of two people having the same birthday reaches 100% when n (number of people) reaches 366. However, 99% probability is reached with just 57 people, and 50% probability with only 23 people. Remarkably, the applications for this exploit of mathematical probability are all over the place, including in a cryptographic brute force search. Before we get to that, here’s how it all works. Note that in this explanation I’ll be ignoring leap years – that’s just unnecessary confusion.
First, we need to understand how do we approach this problem. It would be mathematically difficult to calculate the probability of there being a shared birthday within a set of people – so we instead can calculate the opposite – the probability of no one sharing a birthday. We can then get the answer to the original question by simply subtracting from 1.
As the probability of a match is roughly 50% at around 23 people, let’s use that for this example.
Each with an equal chance of having a birthday on each day. Let’s start with a single person. What are the odds of that person’s birthday landing on any specific day?
That makes sense so far. They must have a birthday, right? Now – let’s add another person, and figure the odds of that person having a birthday on any day except the last persons birthday.
Things get a little more interesting here. Now, let’s do this for all of our people, our n value, 23 in this case.
This gives us a result of about 0.4927 – almost 0.5 or 50%. But we’re not quite done yet. Remember we switched this problem around – so let’s give a final probability equation.
The birthday problem is fascinating because it is the result of comparing individual probabilities against each other. As each person is added to the room, the chance of their birthday being the same as another increases as each new person is compared to each person that came before them. This results in a trend that looks something like this: