Part 1:
Let f(s,0) be the number of subsets drawn from an alphabet of s letters (let them be the first s letters of the English alphabet) that do not contain the sth letter, and f(s,1) be the number that do contain the sth letter.
f(1,0)=1 and f(1,1)=1, adding to 2 (the null set and {a}).
f(n+1,0)=f(n,0)+f(n,1) as you are not utilizing the new sth (=n+1st) letter.
f(n+1,1)=f(n,0) as you can't use the letter after the last one used.
s f(s,0) f(s,1) total
1 1 1 2
2 2 1 3
3 3 2 5
4 5 3 8
5 8 5 13
6 13 8 21
and we see the Fibonacci numbers emerging.
For s=1 we get 2, the 3rd Fibonacci number, so we seek the 28th Fibonacci number, 317,811, found on the list in Sloane's OEIS. Remember that this total includes the null set.
Part 2:
The early version, with alphabet size (sz) at a constant 26, took way too long to finish. At present it even balks at sz=11.
DECLARE SUB addOn ()
CLEAR , , 25000
DEFDBL A-Z
DIM SHARED s$, ct, sz
FOR sz = 1 TO 11
ct = 0
addOn
PRINT sz, ct
NEXT
SUB addOn
FOR i = 0 TO sz
IF i = 0 THEN
' PRINT s$
ct = ct + 1 ' terminate string here
ELSE
c$ = CHR$(96 + i)
IF INSTR(s$, c$) = 0 THEN
good = 1
IF LEN(s$) > 0 THEN
IF ABS(ASC(c$) - ASC(RIGHT$(s$, 1))) = 1 THEN good = 0
END IF
IF good THEN
s$ = s$ + c$
addOn
s$ = LEFT$(s$, LEN(s$) - 1)
END IF
END IF
END IF
NEXT
END SUB
alphabet number of
size ordered subsets
1 2
2 3
3 6
4 17
5 70
6 379
7 2498
8 19185
9 167514
10 1635971
putting the first few numbers into Sloane's OEIS produces no result.
As a check that all such strings are counted, and only those strings the counted strings were printed out for size 4 alphabet:
a
ac
ad
adb
b
bd
bda
bdac
c
ca
cad
cadb
d
da
dac
db
which includes a null string at the top, to make the 17 reported.
With a modification to report substrings of various sizes that make up the total count, we get (the total appears at the extreme right):
1 1 1 2
2 1 2 0 3
3 1 3 2 0 6
4 1 4 6 4 2 17
5 1 5 12 18 20 14 70
6 1 6 20 48 90 124 90 379
7 1 7 30 100 272 582 860 646 2498
8 1 8 42 180 650 1928 4386 6748 5242 19185
9 1 9 56 294 1332 5110 15912 37566 59612 47622 167514
10 1 10 72 448 2450 11604 46250 148648 360642 586540 479306 1635971
11 1 11 90 648 4160 23534 114900 470230 1547272 3835230 6362636 5296790 17655502
each row being in order: the size of the alphabet, count of zero-length strings, length-1 strings, etc. and finally the total count.
Going to the use of QB64 to speed things up, we can get
12 1 12 110 900 6642 43792 253890 1267236 5291450 17737608 44735778 63779034 208554913
The only patterns I see in this array are:
Let n(a,s) be the number of length-s valid strings formed from an alphabet of size a.
then
n(a,0)=1
n(a,1)=a
n(a,2)=n(a-2,1)*n(a-1,1)=(a-2)*(a-1)
n(a,3)=n(a-2,1)*n(a-1,2)=(a-2)*(a-3)*(a-2)
The ways of rearranging all the letters (n(a,a)) is given by the OEIS's A002464. That sequence for various a is:
4 2
5 14
6 90
7 646
8 5242
9 47622
10 479306
11 5296790
12 63779034
13 831283558
14 11661506218
15 175203184374
16 2806878055610
17 47767457130566
18 860568917787402
19 16362838542699862
20 327460573946510746
21 6880329406055690790
22 151436547414562736234
23 3484423186862152966838
24 83655126041771262574458
25 2092014180086865279171334
26 54406969991009281966468810
(based on the formula given in Sloane: a(n)=(n+1)*a(n-1)-(n-2)*a(n-2)-(n-5)*a(n-3)+(n-3)*a(n-4), where their a(n) is my n(a,a).)
So the number n(26,26) is 54,406,969,991,009,281,966,468,810, which counts only the words that use all 26 of the letters of the alphabet. We can't utilize this in combination with lower numbers, such as 26*2092014180086865279171334 representing 25-letter strings, as the 2092014180086865279171334 is for a continuous alphabet of 25 letters; a subset of 25 from all 26 has a gap in the middle that affords greater flexibility, in allowing for example, K next to M in the "alphabet" lacking an L. So each of the 26 25-letter subsets that can be used as strings in a 26-letter alphabet would have more than 2092014180086865279171334 allowable permutations.
Via Sloane, the column n(a,3) is also identified as (a-2)^3-(a-2)^2, which agrees with the above (a-2)*(a-3)*(a-2). Column (a,4) comes out to (a-3)^2*((a-3)^2+1), which is also (a-3)^4 + (a-3)^2.
|
Posted by Charlie
on 2014-03-25 15:41:33 |