perplexus.info :: Just Math : Monkey Business

Monkey Business (Posted on 2004-04-30)

Suppose someone comes onto the site and is bored. That person starts spreading the rumor that levik is a monkey at exactly noon (on day 1) by sending an e-mail to one random person.

Then, each person sends an e-mail about this rumor (at exactly noon, on day 2) to one person. They can send it to anyone on perplexus (but themselves), even if that person already knows the rumor, or even if it was the person who told them about it.

Each successive day at noon, everyone that knows about the rumor sends a message to one other random person.

On average, on what day will everyone know about the rumor if there are 40 people (including the one that spread the rumor initially) at perplexus while the rumor is still spreading?

What if there were x people at perplexus while the rumor is still spreading?

No Solution Yet Submitted by Gamer

Rating: 3.5000 (4 votes)

Comments: ( Back to comment list | You must be logged in to post comments.)

solution

| Comment 1 of 12

This problem has some similarity to The Random Problem, except in that instance only one person was seeking out a new item (in that case problem, in the present case person), while now each already-selected person chooses another person at random. In "The Random Problem" we produced a Markov chain of seen vs unseen problems, calculating the transition probabilities (having seen x, transitioning to having seen x+1) as we went along.

The transition probabilities (conditional probabilities based on how many already have heard the rumor) for the present case are complicated enough that we should compute them ahead of time and save them in an array. (Oh, yes, did I mention: a computer is involved in the solution.)

Each time a given person selects someone to tell, the probability of the number having heard the rumor increasing by one is (40 - p)/39, where p is the number of people now knowing the rumor. This must be iterated as many times as the number of people who, starting that day, already knew the rumor, but on each iteration we do have to keep track of the current full number of people who know the rumor.

In the program, the array is kept as the row being the number already knowing the rumor and the column being the number knowing the rumor after the day is over. However, for brevity, in the following table (to much less precision than is really kept), the first probability on each line represents the probability of the number staying the same (x -> x); the next column has the probability of increasing by 1; then by 2, etc.:

 2 0.001 0.075 0.924
 3 0.000 0.012 0.202 0.786
 4 0.000 0.003 0.053 0.333 0.611
 5 0.000 0.001 0.017 0.133 0.418 0.432
 6 0.000 0.000 0.006 0.057 0.234 0.427 0.275
 7 0.000 0.000 0.003 0.027 0.130 0.317 0.366 0.157
 8 0.000 0.000 0.001 0.014 0.076 0.220 0.343 0.266 0.079
 9 0.000 0.000 0.001 0.008 0.047 0.153 0.287 0.304 0.165 0.035
10 0.000 0.000 0.001 0.005 0.031 0.110 0.233 0.299 0.221 0.086 0.013
11 0.000 0.000 0.000 0.004 0.022 0.082 0.190 0.277 0.249 0.133 0.038 0.004
12 0.000 0.000 0.000 0.003 0.017 0.065 0.158 0.251 0.258 0.167 0.065 0.014 0.001
13 0.000 0.000 0.000 0.002 0.014 0.053 0.136 0.229 0.257 0.190 0.090 0.026 0.004
14 0.000 0.000 0.000 0.002 0.012 0.046 0.120 0.210 0.251 0.203 0.109 0.038 0.008
15 0.000 0.000 0.000 0.002 0.011 0.042 0.110 0.197 0.244 0.209 0.123 0.049 0.012
16 0.000 0.000 0.000 0.002 0.010 0.039 0.104 0.188 0.238 0.212 0.131 0.056 0.016
17 0.000 0.000 0.000 0.002 0.010 0.039 0.101 0.183 0.234 0.211 0.135 0.061 0.019
18 0.000 0.000 0.000 0.002 0.011 0.040 0.102 0.182 0.231 0.209 0.136 0.062 0.020
19 0.000 0.000 0.000 0.002 0.012 0.042 0.105 0.184 0.230 0.206 0.133 0.061 0.020
20 0.000 0.000 0.000 0.003 0.013 0.046 0.111 0.189 0.230 0.201 0.126 0.057 0.018
21 0.000 0.000 0.000 0.003 0.016 0.052 0.121 0.197 0.230 0.193 0.117 0.051 0.016
22 0.000 0.000 0.001 0.004 0.019 0.060 0.133 0.206 0.229 0.183 0.105 0.043 0.012
23 0.000 0.000 0.001 0.005 0.024 0.071 0.148 0.217 0.227 0.170 0.091 0.034 0.009
24 0.000 0.000 0.001 0.007 0.030 0.086 0.167 0.228 0.221 0.153 0.074 0.025 0.006
25 0.000 0.000 0.002 0.010 0.040 0.104 0.188 0.237 0.210 0.131 0.057 0.017 0.003
26 0.000 0.000 0.002 0.014 0.053 0.128 0.211 0.242 0.192 0.106 0.040 0.010 0.002
27 0.000 0.000 0.004 0.021 0.071 0.156 0.233 0.239 0.167 0.079 0.024 0.005 0.000
28 0.000 0.001 0.006 0.031 0.096 0.190 0.252 0.225 0.134 0.052 0.012 0.002 0.000
29 0.000 0.001 0.011 0.047 0.129 0.226 0.260 0.197 0.096 0.029 0.005 0.000
30 0.000 0.002 0.018 0.071 0.171 0.260 0.253 0.155 0.058 0.012 0.001
31 0.000 0.005 0.030 0.106 0.221 0.282 0.223 0.104 0.026 0.003
32 0.001 0.009 0.052 0.156 0.272 0.281 0.168 0.053 0.007
33 0.001 0.018 0.088 0.222 0.312 0.244 0.098 0.016
34 0.003 0.036 0.147 0.298 0.316 0.166 0.034
35 0.008 0.072 0.236 0.360 0.256 0.068
36 0.020 0.143 0.351 0.358 0.128
37 0.052 0.273 0.447 0.229
38 0.135 0.475 0.390
39 0.363 0.637
40 1.000

I've truncated the right side so as not to upset by too much the formatting on perplexus.

As a sanity check on the above:

If there are already 2 knowers, the probability they each choose each other is (1/39)^2 = 0.0006574621959, which rounds to .001. The probability they each choose the same other person is (38/39)*(1/39) = 0.0249835634451; but there's the same probability that the first person will choose a new person and the second person will choose the first, or vice versa. These three equal number add up to the .075 shown (to the given precision).

On the other end, the probability that the lone non-hearer will be contacted by none of 39 knowers is (38/39)^39 = 0.3631119936942, which value is indeed the one stored internally for the column representing 39 staying 39 the next day.

Here's a probability distribution of the number of knowers by day:
(days 5-7 have been truncated at the right.)

     2
 1 1.000

     2     3     4
 2 0.001 0.075 0.924

     4     5     6     7     8
 3 0.002 0.018 0.108 0.308 0.565

     7     8     9    10    11    12    13    14    15    16
 4 0.001 0.004 0.014 0.042 0.094 0.170 0.237 0.242 0.150 0.045

    12    13    14    15    16    17    18    19    20    21    22    23    24   
 5 0.001 0.002 0.005 0.012 0.026 0.048 0.079 0.115 0.147 0.162 0.151 0.118 0.075

    19    20    21    22    23    24    25    26    27    28    29    30    31   
 6 0.001 0.002 0.005 0.010 0.019 0.033 0.055 0.082 0.111 0.134 0.145 0.136 0.111

    26    27    28    29    30    31    32    33    34    35    36    37    38   
 7 0.001 0.003 0.006 0.013 0.027 0.049 0.082 0.122 0.159 0.176 0.160 0.115 0.061

    31    32    33    34    35    36    37    38    39    40
 8 0.001 0.003 0.008 0.022 0.054 0.113 0.196 0.260 0.235 0.107

    34    35    36    37    38    39    40
 9 0.001 0.003 0.012 0.049 0.158 0.355 0.422

    36    37    38    39    40
10 0.001 0.005 0.040 0.231 0.723

    38    39    40
11 0.007 0.106 0.887

    38    39    40
12 0.001 0.042 0.957

    39    40
13 0.016 0.984

    39    40
14 0.006 0.994

    39    40
15 0.002 0.998

    39    40
16 0.001 0.999

    40
17 1.000

    40
18 1.000

...

    40
42 1.000

    40
43 1.000

    40
44 1.000

where the numbers at the left are the day numbers and the number above each probability being the possible number of people knowing the rumor. Probabilities less than .0005 are not shown. The 1.000 probability of 40 is carried out to day 44 as it takes until then for the probability to reach 1.00000000000000 to 15 significant figures.

The program, keeping track of the expected day on which all 40 know the rumor, comes up with day 9.925828356622571.

The program producing these results is:

CLS
DEFDBL A-Z

OPEN "proboutp.txt" FOR OUTPUT AS #2

totPeople = 40
maxSub = totPeople + 1
DIM recd(maxSub), newRow(maxSub)
'recd contains the prob that that many people have received the rumor
DIM transit(maxSub, maxSub)
' transit(x,y) contains the transition probability that from one day when
'    x people have seen the rumor, the next day y people will have seen it.

FOR tranRow = 1 TO totPeople
  REDIM person(1, maxSub)
  person(0, tranRow) = 1  ' prob = 1 of this number as that is the given
  FOR i = 1 TO tranRow   'each person already w/rumor has a chance to spread
   FOR j = tranRow TO totPeople ' possibilities
    IF j > 2 * tranRow THEN EXIT FOR  ' can't be more than doubled
    person(1, j + 1) = person(1, j + 1) + person(0, j) * (totPeople - j) / (totPeople - 1)
    person(1, j) = person(1, j) + person(0, j) * (j - 1) / (totPeople - 1)
   NEXT j
   FOR j = tranRow TO totPeople
     person(0, j) = person(1, j)
     person(1, j) = 0 'get ready for next generation (person's choice)
   NEXT j
  NEXT i
  FOR j = tranRow TO totPeople
   transit(tranRow, j) = person(0, j)
  NEXT j
NEXT tranRow

FOR x = 1 TO totPeople
 PRINT #2, USING "##"; x;
 FOR y = x TO totPeople
  IF y > 2 * x THEN EXIT FOR
  PRINT #2, USING "##.###"; transit(x, y);
 NEXT
 PRINT #2,
 totline = totline + 1
NEXT

recd(1) = 1
expectation = 1
day = 0
DO
  day = day + 1

  FOR i = 1 TO totPeople
    FOR j = i TO totPeople
     newRow(j) = newRow(j) + recd(i) * transit(i, j)
    NEXT
  NEXT
  FOR j = 1 TO totPeople
    recd(j) = newRow(j)
    newRow(j) = 0
  NEXT j


  FOR i = 0 TO totPeople
   IF recd(i) >= .0005 THEN
    PRINT #2, USING "    ##"; i;
   END IF
  NEXT
  PRINT #2,
  PRINT #2, USING "##"; day;
  FOR i = 0 TO totPeople
   IF recd(i) >= .0005 THEN
    PRINT #2, USING " #.###"; recd(i);
   END IF
  NEXT

  PRINT #2, : PRINT #2,

  expectation = expectation + 1 - recd(40)

LOOP UNTIL ABS(recd(40) - 1) < 1E-15

PRINT #2, expectation
CLOSE

A similar program finds the following for various numbers of people who belong to the group:

   2  1.000000000
   3  2.333333333
   4  3.190789474
   5  3.933115646
   6  4.484990456
   7  4.969990399
   8  5.377911218
   9  5.731734652

  10  6.047132263
  20  8.041985614
  30  9.155141733
  40  9.925828357
  50 10.514640503
  60 10.990574356
  70 11.389952004
  80 11.733712559
  90 12.035541692
 100 12.304523881
 110 12.546969522
 120 12.767661160
 130 12.970249630
 140 13.157468101
 150 13.331421386
 160 13.493828416
 170 13.646139475
 180 13.789570366
 190 13.925120576
 200 14.053606496

As the number of people increases, the day number increases even more slowly than a logarithmic function: When the number of people doubles from 2 to 4, the day number increases by 2.19; when the number of people doubles from 5 to 10, the day number increases by 2.11; when the number of people doubles from 10 to 20, the day number increases by 1.99; from 50 to 100, by 1.79; from 100 to 200, by 1.75. Perhaps it's getting closer to matching a logarithmic curve; however, after 200 the array becomes too large to fit in the memory allocated to DOS.

Posted by Charlie on 2004-04-30 14:29:03

Please log in:

Login:
Password:
Remember me:

Sign up! \| Forgot password

Search:

Search body:

Forums (0)
Newest Problems
Random Problem
FAQ | About This Site
Site Statistics
New Comments (5)
Unsolved Problems
Top Rated Problems
This month's top
Most Commented On

Chatterbox:


perplexus dot info