Let x and y each be random real numbers chosen uniformly from the interval (0,1). Call the product of these numbers z.
1) Find the probability that z is in the interval (.1,.2).
2) Find the probability that z is any decimal whose first digit when written in scientific notation is a 1.
Bonus) Find the full probability distribution of the first digit of z.
Note: Give all answers as exact numbers in closed form.
Part 1)
We seek the area between the curves xy=.1 and xy=.2 within the unit square with corners (0,0), (1,0), (1,1), (0,1).
The two curves are y=.1/x and y=.2/x.
integ{.1 to 1} .1/x dx = [.1*ln(x)]{eval .1 to 1} = -.1*ln(.1)
area = .1 - .1*ln(.1)
(The constant added to the integral accounts for the rectangle to the left of where the integration starts, as all that area counts as both under the curve and within the square.)
integ{.2 to 1} .2/x dx = [.2*ln(x)]{eval .2 to 1} = -.2*ln(.2)
area = .2 - .2*ln(.2)
Area = .2 - .2*ln(.2) - .1 +.1*ln(.1) ~= 0.191629073187416
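As a quick cross-check of the arithmetic above, here is a minimal Python sketch of the same calculation. The function name area_below is my own label (not part of the original solution) for the "rectangle plus integral" area under xy = c inside the unit square.

```python
import math

def area_below(c):
    """Area of the part of the unit square where x*y < c, for 0 < c <= 1.

    Hypothetical helper name: a rectangle of width c to the left of x = c,
    plus the integral of c/x from x = c to x = 1.
    """
    rectangle = c
    integral = -c * math.log(c)   # [c*ln(x)] evaluated from c to 1
    return rectangle + integral   # c - c*ln(c)

# Probability that .1 < z < .2 is the area between the two curves.
p_part1 = area_below(0.2) - area_below(0.1)
print(p_part1)
```

The printed value should agree with the ~0.1916 figure above to the precision of double arithmetic.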
Numeric verification:
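DEFDBL A-Z
' Monte Carlo estimate of P(.1 < x*y < .2); runs until interrupted
DO
  x = RND(1)
  y = RND(1)
  p = x * y
  IF p > .1 AND p < .2 THEN hit = hit + 1
  ct = ct + 1
  PRINT hit; ct, hit / ct   ' hits, trials, running hit rate
LOOP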
which, at the point where it was stopped, had produced:
247287 hits out of 1291338 for a hit rate of .191496726650962
BTW: If I hadn't done the numeric verification, I would have forgotten about the rectangles to the left of the area of integration.
Part 2)
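The first digit is a 1 when z lies in (.1,.2), (.01,.02), (.001,.002), ..., that is, in (1/10^i, 2/10^i) for some i >= 1. Applying the Part 1 area formula to each of these intervals and adding them up gives
Sigma{i = 1 to inf.} (1/10^i * (1 + ln(1/10^i)) - 2/10^i * ln(2/10^i) )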
Can this be simplified?
Note that Sigma(1/10^i) is 1/9.
But the other terms are harder to sum: the 10^i in the denominator outside the ln function makes the terms shrink geometrically, while the 10^i inside the ln function makes them change arithmetically, since ln(1/10^i) = i * ln(1/10). There's a formula for arithmetic-geometric series, used as follows:
We need Sigma(1/10^i * ln(1/10^i)) and Sigma(2/10^i * ln(2/10^i)).
The formula given at http://planetmath.org/encyclopedia/ArithmeticGeometricSeries.html for arithmetic-geometric series is
Sigma{1 to inf}(i * q^i) = q / (1 - q)^2
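Before relying on that formula, a small numeric sanity check for q = 1/10 doesn't hurt. This is only a sketch: partial_sum is a made-up helper name, and the 200-term cutoff is an arbitrary choice that is far more than enough at this q.

```python
def partial_sum(q, terms=200):
    """Partial sum of i * q**i for i = 1..terms (hypothetical helper)."""
    return sum(i * q ** i for i in range(1, terms + 1))

q = 1 / 10
closed_form = q / (1 - q) ** 2   # the arithmetic-geometric formula
print(partial_sum(q), closed_form, 10 / 81)
```

All three printed numbers should coincide to within rounding, which is the 10/81 factor used in the steps that follow.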
Let's try applying that to Sigma(1/10^i * ln(1/10^i)).
The value of q is clearly 1/10. That takes care of the geometric part.
The increments of the arithmetic part are ln(1/10). When i=1 you have one of them; when i=2 you have two of them, etc., so
Sigma(1/10^i * ln(1/10^i)) = ln(1/10) * ((1/10)/(9/10)^2) = ln(1/10) * 10^2 / (9^2 * 10) = ln(1/10) * 10/81
Then for Sigma(2/10^i * ln(2/10^i)),
or 2 * Sigma(1/10^i * (ln(2) - ln(10^i)))
= 2 * (ln(2) * Sigma(1/10^i) - Sigma((1/10^i) * ln(10^i)) )
= 2 * ln(2) / 9 - 2 * ln(10) * 10/81
So in total we get
1/9 + ln(1/10) * 10/81 - (2 * ln(2) / 9 - 2 * ln(10) * 10/81)
which simplifies further (as ln(1/10) = -ln(10)) to
1/9 - 2 * ln(2) / 9 + 10 * ln(10) / 81 ~= .241348168887179
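To make sure no sign got lost in the simplification, here is a rough Python check that sums the original series term by term and compares it with the closed form. The name series_term and the 40-term cutoff are my own choices for illustration.

```python
import math

def series_term(i):
    """i-th term of the Part 2 series: P(1/10^i < z < 2/10^i)."""
    c = 10.0 ** -i
    return c * (1 + math.log(c)) - 2 * c * math.log(2 * c)

# 40 terms is an arbitrary cutoff; later terms are far below double precision.
series = sum(series_term(i) for i in range(1, 41))
closed = 1 / 9 - 2 * math.log(2) / 9 + 10 * math.log(10) / 81
print(series, closed)
```

The two printed values should match each other and the ~.2413 figure above.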
Bonus)
For a generalized area, using d as the digit and D as one digit higher:
Area = .D - .D*ln(.D) - .d +.d*ln(.d)
and the summation of these slivers of area for a given d:
Sigma{i = 1 to inf.} (1/10^i + d/10^i * ln(d/10^i) - D/10^i * ln(D/10^i) )
where, when d is 9, D is 10. The leading 1/10^i term comes from D/10^i - d/10^i, since D - d is always 1 (just as .D - .d is always .1).
The following program evaluates the first 20 terms of the infinite sums, which should be sufficient for the accuracy presented:
DEFDBL A-Z
FOR d = 1 TO 9
  dp = d + 1   ' D, one digit higher than d
  t = 0
  FOR i = 1 TO 20   ' partial sum of the series for digit d
    t = t + 1 / 10 ^ i + d / 10 ^ i * LOG(d / 10 ^ i) - dp / 10 ^ i * LOG(dp / 10 ^ i)
  NEXT
  sum = sum + t   ' running cumulative probability
  PRINT USING "# #.###### #.#######"; d; t; sum
NEXT
d  p(d)      cumulative
1  0.241348  0.2413482
2  0.183209  0.4245577
3  0.145454  0.5700118
4  0.117380  0.6873913
5  0.095007  0.7823981
6  0.076402  0.8587996
7  0.060474  0.9192736
8  0.046549  0.9658224
9  0.034178  1.0000000
The total of the probabilities is indeed 1, so this is reasonable.
Simulation verification:
DEFDBL A-Z
DO
  x = RND(1)
  y = RND(1)
  p = x * y
  ps$ = STR$(p)
  ' scan the printed form of p for its first non-zero digit
  FOR i = 1 TO LEN(ps$)
    ix = INSTR("123456789", MID$(ps$, i, 1))
    IF ix > 0 THEN EXIT FOR
  NEXT
  hit(ix) = hit(ix) + 1
  ct = ct + 1
  ' raw counts per leading digit
  FOR i = 1 TO 9
    PRINT hit(i);
  NEXT: PRINT
  ' fraction of all trials per leading digit
  FOR i = 1 TO 9
    PRINT USING " #.#####"; hit(i) / ct;
  NEXT: PRINT
LOOP
produces, as the last output before being stopped, the following summary of raw counts of the leading non-zero digit, and the fraction of the total occurrences accounted for by each digit:
91048 69688 55390 44543 36097 28995 23121 17647 12975
0.23991 0.18363 0.14595 0.11737 0.09512 0.07640 0.06092 0.04650 0.03419
This agrees well with the theory.
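For anyone without a Basic interpreter handy, the same experiment can be sketched in Python. This is not a port of the program above, just the same idea under my own assumptions: a fixed seed (12345) and trial count chosen arbitrarily, the leading digit read off the scientific-notation string, and the p(d) column from the table above copied in as TABLE for comparison.

```python
import random

# p(d) values copied from the table above, for d = 1..9.
TABLE = [0.241348, 0.183209, 0.145454, 0.117380, 0.095007,
         0.076402, 0.060474, 0.046549, 0.034178]

def leading_digit(p):
    """First significant digit of a positive float, via scientific notation."""
    return int(f"{p:.17e}"[0])

def simulate(n, seed=12345):
    """Fractions of n random products x*y whose leading digit is 1..9."""
    rng = random.Random(seed)   # fixed seed is an assumption, for repeatability
    counts = [0] * 10
    trials = 0
    while trials < n:
        p = rng.random() * rng.random()
        if p == 0.0:            # random() can return 0.0; skip that rare case
            continue
        counts[leading_digit(p)] += 1
        trials += 1
    return [counts[d] / n for d in range(1, 10)]

fractions = simulate(200000)
for d, (f, t) in enumerate(zip(fractions, TABLE), start=1):
    print(d, f"{f:.5f}", f"{t:.5f}")
```

Each simulated fraction should land within sampling noise of the corresponding table entry; with 200000 trials that noise is on the order of 0.001.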
Now let's get rid of the Sigmas in
Sigma{i = 1 to inf.} (1/10^i + d/10^i * ln(d/10^i) - D/10^i * ln(D/10^i) )
for the generalized probability of beginning with digit d, where D = d + 1.
The three terms can be summed separately, and the first one evaluates to 1/9. The second and third are identical to each other except that one uses d and the other uses D.
Sigma(d/10^i * ln(d/10^i))
= d * Sigma(1/10^i * ln(d/10^i))
= d * Sigma(1/10^i * (ln(d) - i * ln(10)))
= d * Sigma(ln(d)/10^i) - d * Sigma(1/10^i * i * ln(10))
= d * ln(d) / 9 - d * ln(10) * Sigma(i/10^i)
= d * ln(d) / 9 - d * ln(10) * (1/10)/(9/10)^2
= d * ln(d) / 9 - d * ln(10) * 10/81
and since the other term is the same with D replacing d,
Sigma(D/10^i * ln(D/10^i)) = D * ln(D) / 9 - D * ln(10) * 10/81
Using the three terms in the original formula:
1/9 + d * (ln(d)/9 - 10 * ln(10)/81) - D * (ln(D)/9 - 10 * ln(10)/81)
= 1/9 + d * ln(d)/9 - D * ln(D)/9 + 10 * (D-d) * ln(10)/81
= (1 + d * ln(d) - D * ln(D)) / 9 + 10 * ln(10) / 81, since D-d = 1
or
(1 + d * ln(d) - (d+1) * ln(d+1)) / 9 + 10 * ln(10) / 81
To see if the results are the same:
DEFDBL A-Z
FOR d = 1 TO 9
  prob = (1 + d * LOG(d) - (d + 1) * LOG(d + 1)) / 9 + 10 * LOG(10) / 81
  PRINT USING "## #.#######"; d; prob
NEXT
produces
1 0.2413482
2 0.1832095
3 0.1454541
4 0.1173795
5 0.0950067
6 0.0764015
7 0.0604741
8 0.0465488
9 0.0341776
in agreement with evaluating the series as series.
(The LOG function in Basic is the natural log.)
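The closed form also shows directly why the nine probabilities add up to exactly 1: summed over d = 1 to 9, the d * ln(d) - (d+1) * ln(d+1) pieces telescope to -10 * ln(10), so the total is (9 - 10 * ln(10)) / 9 + 9 * 10 * ln(10) / 81 = 1. Here is a short Python sketch of the same formula that checks this numerically; p_first_digit is just a name I picked.

```python
import math

def p_first_digit(d):
    """Closed-form probability that the first digit of z = x*y is d (1..9)."""
    return (1 + d * math.log(d) - (d + 1) * math.log(d + 1)) / 9 \
        + 10 * math.log(10) / 81

probs = [p_first_digit(d) for d in range(1, 10)]
for d, p in enumerate(probs, start=1):
    print(d, f"{p:.7f}")
print("total:", sum(probs))
```

The per-digit values should reproduce the list above, and the total should be 1 up to rounding.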
Posted by Charlie on 2009-08-17 18:53:29