All about flooble | fun stuff | Get a free chatterbox | Free JavaScript | Avatars    
perplexus dot info

Home > Probability
Common Birth Dates II (Posted on 2021-11-17) Difficulty: 3 of 5
Common Birth Dates asked for the expected number of pairs of common birth dates for a cohort of 1000 people whose ages were clustered around a certain value.

Now consider a similar question to the traditional birthday problem: If you attend a concert or other event where people tend to be of a similar age, how large would such an event need to be in terms of number of attendees, so that you'd have at least a 50% probability that there would be at least two people born on the exact same day (year, month and day)?

Again to make things specific: assume the standard deviation about whatever the mean age to be is 12 years and it follows a normal distribution.

Again feel free to vary the assumptions about the distribution.

See The Solution Submitted by Charlie    
Rating: 4.5000 (2 votes)

Comments: ( Back to comment list | You must be logged in to post comments.)
One more examination | Comment 4 of 6 |
I was looking to understand the discrepancy between my analysis and simulations. In the last post, I blamed it on not considering conditional probability. But, reexamining the results, I have concluded that the analytic approximation is, in fact, very accurate. 

In hindsight, it is remarkable that no simulation at all (i.e., one employing a normal distribution random number generator) is needed to solve this problem.

Here are the results for running a simulation of "sample" persons where the sample ranges from 7 to 447. The mean is the fraction of groups that had at least 1 shared birthday. The distribution of ages is P(i), a normal PDF with a mean of 60 years, a sigma of 12 years, and cutoffs at 0 and 120 years, with i being the day number. The distribution was renormalized to integrate to exactly 1.0 even with the tiny 5-sigma cutoff. I ran each sample 30,000 times, repeating 5 times to get a mean and sigma. 

My approximation for the probability P of at least 1 birthday for sample n is given by: 
P(n) = 1-P(pair)^(n(n-1)/2),
where P(pair)={sum, i=1,m of p(i)(1-p(i)}
and where m is the number of days in 120 years, m=43825. 
p(i) is the usual gaussian. 

The program is here. In the table below, I compare the simulated mean to the analytic approximation. 1-sigma for the simulation is given in parentheses

sample   7, mean= 0.0014 (0.0001), approx = 0.0013 
sample  27, mean= 0.0218 (0.0008), approx = 0.0222
sample  47, mean= 0.0672 (0.0009), approx = 0.0669
sample  67, mean= 0.1342 (0.0024), approx = 0.1321
sample  87, mean= 0.2143 (0.0023), approx = 0.2132
sample 107, mean= 0.3092 (0.0033), approx = 0.3047
sample 127, mean= 0.4027 (0.0022), approx = 0.4011
sample 147, mean= 0.5005 (0.0019), approx = 0.4972
sample 167, mean= 0.5916 (0.0016), approx = 0.5886
sample 187, mean= 0.6750 (0.0004), approx = 0.6719
sample 207, mean= 0.7465 (0.0013), approx = 0.7449
sample 227, mean= 0.8107 (0.0008), approx = 0.8067
sample 247, mean= 0.8597 (0.0021), approx = 0.8573
sample 267, mean= 0.8999 (0.0016), approx = 0.8973
sample 287, mean= 0.9301 (0.0008), approx = 0.9279
sample 307, mean= 0.9522 (0.0004), approx = 0.9507
sample 327, mean= 0.9669 (0.0009), approx = 0.9671
sample 347, mean= 0.9791 (0.0009), approx = 0.9787
sample 367, mean= 0.9865 (0.0007), approx = 0.9865
sample 387, mean= 0.9915 (0.0004), approx = 0.9917
sample 407, mean= 0.9952 (0.0003), approx = 0.9950

Conditional probabilities can help explain why the approximation is a tiny bit low: strictly speaking, successive addition of more people to the sample leaves fewer "unclaimed" birthdays. But this fraction nearly 1. 
Maximum fraction  =  (43825-407)/43825 = 0.99 for the samples above.

I am not sure why my previous simulation (with a different algorithm) gave the high number of 157 for the 50% probability. 147 is correct. Simulations with the new algorithm agree give the smaller values above.

An exact analytic analysis might require considering all possible current populated birthday configurations and ways adding one more person to yield a match. But this might alternatively take a probabilistic form. Perhaps there is some neat recursion trick to generate the equation governing the addition of each new person. 

Edited on November 21, 2021, 9:51 pm
  Posted by Steven Lord on 2021-11-21 14:58:59

Please log in:
Login:
Password:
Remember me:
Sign up! | Forgot password


Search:
Search body:
Forums (1)
Newest Problems
Random Problem
FAQ | About This Site
Site Statistics
New Comments (17)
Unsolved Problems
Top Rated Problems
This month's top
Most Commented On

Chatterbox:
Copyright © 2002 - 2024 by Animus Pactum Consulting. All rights reserved. Privacy Information