Suppose there is an illness that affects 1% of people. Also assume that there is a test for that illness that gives the correct result 99% of the time.
If you take that test, and receive a POSITIVE result, should you worry much?
If you take it again, and once more get a POSITIVE, should you worry then?
How many consecutive POSITIVEs would you have to get in order to be sure that the chances of a wrong diagnostic are 1 in a million?
(In reply to re(2): solution by Bob)
There are actually two probabilities being discussed here. One is the overall probability of a test result being accurate. This is the 99% chance that Penny refers to. But once we move to talking about a known positive result, the probability of that particular test being accurate is 50%, as Charlie and Eric correctly conclude.
Charlie and Eric give a concrete example of how this can be calculated. In more general terms this is how it works (for those who are interested; if not, you can take Charlie and Eric's word for it, since they are correct, and skip to the bottom of this post to see the solution to the consecutive positives part of the question):
To say that a positive test result is accurate is the same thing as saying the person has the illness. So what we want to know is given the fact that the test result is positive, what is the probability that I am ill? This is known as a conditional probability.
What information is given in the problem itself? First of all we know the overall probability that one has the illness. Let B1 = "Person has illness" and B2 = "Person doesn't have illness." Standard notation is P(event) = x, meaning the probability of a certain event is "x." So we know that P(B1) = 0.01 and P(B2) = 0.99. We also know the probability that a test is accurate: P(accurate) = 0.99.
But going into a test, what is the probability that it will be positive? It depends on whether or not the person has the disease, right? Another conditional probability, but not quite the one we are after. We do need to calculate it, however, to get to that final answer. Let A = "Positive Test." The notation for a conditional probability is P(event 1 | event 2) = x. This means "the probability of event 1, given that event 2 has happened or is true, is x." Every person either has the disease or doesn't; B1 and B2 can't both be true for one individual. This is important because it allows us to calculate P(A) by the following:
P(A) = contribution to overall prob by those with illness + contribution to overall prob by those without illness...
P(A) = P(A | B1)*P(B1) + P(A | B2)*P(B2)
We know P(B1) = 0.01 and P(B2) = 0.99 as mentioned earlier. Also given in the problem is P(A | B1) = P(positive result given the person has the illness) = P(accurate test) = 0.99; and P(A | B2) = P(positive result given no illness) = P(inaccurate test) = 0.01.
So P(positive result) = P(A) = 0.99*0.01 + 0.01*0.99 = 0.0198, or just under 2%.
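For anyone who wants to check that arithmetic, here is a minimal Python sketch of the same sum. The variable names are my own, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

# Figures from the problem statement; the variable names are illustrative.
p_ill = Fraction(1, 100)                 # P(B1): person has the illness
p_healthy = 1 - p_ill                    # P(B2): person doesn't have the illness
p_pos_given_ill = Fraction(99, 100)      # P(A | B1): the test is accurate
p_pos_given_healthy = Fraction(1, 100)   # P(A | B2): the test is inaccurate

# Overall chance of a positive result: add the contribution from each group.
p_positive = p_pos_given_ill * p_ill + p_pos_given_healthy * p_healthy

print(float(p_positive))  # 0.0198
```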
(The fact that 99% and 1% are used makes this a little more confusing, as the first 0.99 has to do with accuracy and the second 0.99 refers to not having the disease).
Finally, what we really want to know is P(B1 | A) = P(have the illness given the result is positive). In short, this is equal to the proportion of correctly positive tests out of all possible positive tests. All possible positive tests include both accurate and inaccurate positive tests. Those who have an interest in probability can refer to Bayes' theorem to see how this is developed. But as there is a very small probability that anyone is still reading this, let's get to the answer:
P(accurate positive tests) = P(A | B1)*P(B1) = 0.99*0.01 = 0.0099.
P(all positive tests) = P(A) = 0.0198 (see above).
P(have illness | positive result) = P(accurate positive) / P(all positive) = 0.0099 / 0.0198 = 0.5 !!!! Just as Charlie and Eric said all along! And 50% could certainly be cause for worrying.
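The same division can be written as a small function. This is only a sketch: the name posterior_ill is made up for illustration, and it assumes the test is equally accurate for ill and healthy people, which is how the problem reads.

```python
from fractions import Fraction

def posterior_ill(prior, accuracy):
    """P(ill | positive result) via Bayes' theorem.

    Assumes one accuracy figure applies to both ill and healthy people.
    """
    true_pos = accuracy * prior                 # P(A | B1) * P(B1)
    false_pos = (1 - accuracy) * (1 - prior)    # P(A | B2) * P(B2)
    return true_pos / (true_pos + false_pos)    # accurate positives / all positives

result = posterior_ill(Fraction(1, 100), Fraction(99, 100))
print(result)  # 1/2
```

Trying other priors shows how much the answer depends on how common the illness is: the 50% comes from the 1% illness rate exactly offsetting the 99% accuracy.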
Now to the question of how many consecutive POSITIVES would you have to get in order to be sure that the chances of a wrong diagnostic are 1 in a million.
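Using Charlie's assumption that each test is independent of all previous tests (given whether or not the person actually has the illness), the result of one test becomes the starting point for the next. After one POSITIVE the chance of illness is 0.5, so going into the second test we have P(B1) = P(B2) = 0.5 instead of 0.01 and 0.99. A second POSITIVE then gives P(B1 | A) = 0.99*0.5 / (0.99*0.5 + 0.01*0.5) = 0.99, so the chance of a wrong diagnostic (no illness) drops to 0.01. It is not 0.5 * 0.5 = 0.25, because the second test is judged against the new 50% starting point, not the original 1%. Each further POSITIVE multiplies the odds in favor of illness by 0.99 / 0.01 = 99. The odds start at 1 to 99, so after X POSITIVEs they are 99^(X-1) to 1 (let ^ be the power function), and the chance of a wrong diagnostic is 1 / (1 + 99^(X-1)). The answer to the problem is the answer to this question: for what X is that less than 0.000001 (1 in a million)? Setting 1 / (1 + 99^(X-1)) = 0.000001 and rearranging, this is solved by taking the natural logarithm of both sides.
99^(X-1) = 999999
(X-1)*ln(99) = ln(999999)
X = 1 + ln(999999) / ln(99) = 4.007
Therefore, we round up to get the answer of 5 consecutive positive results required to be sure the chances of error are 1 in a million. (Four POSITIVEs come very close but just miss: 1 / (1 + 99^3) = 1 / 970300, or about 1.03 in a million.)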
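To check the count by brute force, here is a minimal sketch that applies Bayes' theorem once per POSITIVE, feeding each result in as the starting probability for the next test. The function name positives_needed is hypothetical, and the sketch assumes the tests are independent given the person's true status.

```python
from fractions import Fraction

def positives_needed(prior, accuracy, target):
    """Count consecutive positives until P(no illness | all positives) < target."""
    p_ill = prior
    count = 0
    while 1 - p_ill >= target:
        true_pos = accuracy * p_ill                 # ill and tested positive
        false_pos = (1 - accuracy) * (1 - p_ill)    # healthy but tested positive
        p_ill = true_pos / (true_pos + false_pos)   # becomes the prior for the next test
        count += 1
        print(count, float(1 - p_ill))              # chance of a wrong diagnostic so far
    return count

needed = positives_needed(Fraction(1, 100), Fraction(99, 100), Fraction(1, 1_000_000))
print("POSITIVEs needed:", needed)
```

Because each pass starts from the previous result rather than from the original 1%, the chance of a wrong diagnostic shrinks by a factor of roughly 99 with every additional POSITIVE after the first.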

Posted by Kyle on 2005-01-03 21:15:43