There are two parcels, one marked CALCUTTA and one marked TATANAGAR which get lost in transit.
One parcel is found and has the label partly torn (the label can be torn in any manner). It only has the letters 'TA' (adjacent letters).
What is the probability that the recovered parcel was marked CALCUTTA?
A priori, the chance that the recovered package was marked CALCUTTA is 50%.
But the recovered 'TA' is additional evidence that can be factored in using Bayesian analysis.
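Assume every pair of adjacent letters is equally likely to be the piece that survives. CALCUTTA has 7 pairs of adjacent letters, only one of which is 'TA', so p('TA'|'CALCUTTA') = 1/7.
But TATANAGAR has 8 pairs of adjacent letters, two of which are 'TA', so p('TA'|'TATANAGAR') = 2/8.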
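The pair counts can be checked mechanically. This is only a quick sketch; the helper name adjacent_pairs is made up for illustration, and it does nothing more than list the 2-letter windows of each name.

```python
def adjacent_pairs(word):
    """Return every pair of adjacent letters in word, left to right."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


for name in ("CALCUTTA", "TATANAGAR"):
    pairs = adjacent_pairs(name)
    # Report how many of the pairs are exactly 'TA'.
    print(name, pairs, pairs.count("TA"), "of", len(pairs))
```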
Using Bayesian analysis,
p('CALCUTTA'|'TA') = p('TA'|'CALCUTTA')*p('CALCUTTA') / ( p('TA'|'CALCUTTA')*p('CALCUTTA') + p('TA'|'TATANAGAR')*p('TATANAGAR') )
= (1/7)*(1/2) / ( (1/7)*(1/2) + (2/8)*(1/2) ) = 4/11, and that's my answer.
Note that the double 'TA' is not the only issue. The relative length of the words also factors in. If the lost packages were marked 'CALCUTTA' and 'TALLAHASSEE', I would say that the retrieved package is more likely to be 'CALCUTTA', with probability 10/17, even though both names contain only one 'TA': the single 'TA' is 1 of 7 pairs in CALCUTTA but only 1 of 10 pairs in TALLAHASSEE.
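For anyone who wants to check the arithmetic, here is a small sketch of the same Bayes calculation using exact fractions. The function names likelihood and posterior are my own, and the model is the one assumed above: equal priors, and every window of adjacent letters equally likely to be the surviving fragment.

```python
from fractions import Fraction


def likelihood(word, fragment="TA"):
    """Fraction of same-length windows in word that equal fragment."""
    n = len(fragment)
    windows = [word[i:i + n] for i in range(len(word) - n + 1)]
    return Fraction(windows.count(fragment), len(windows))


def posterior(first, second, fragment="TA", prior_first=Fraction(1, 2)):
    """p(first | fragment) when the parcel is either first or second."""
    a = likelihood(first, fragment) * prior_first
    b = likelihood(second, fragment) * (1 - prior_first)
    return a / (a + b)


print(posterior("CALCUTTA", "TATANAGAR"))    # 4/11
print(posterior("CALCUTTA", "TALLAHASSEE"))  # 10/17
```

Changing prior_first shows how much the answer depends on the 50% starting assumption.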