There are currently 118 elements in the Periodic Table, each Symbol consisting of one or two letters.
Some words can be constructed from the set of these Symbols and some cannot.
For example "calculus" has two "elemental representations":
C Al C U Lu S
C Al Cu Lu S
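Finding such representations amounts to a small backtracking search: at each position try a one-letter and a two-letter symbol and recurse on the rest. Below is a minimal Python sketch of that idea (not the MATLAB used later in this post); the names `SYMBOLS`, `BY_LOWER` and `representations` are made up for illustration, and the symbol list is the standard set of 118 element symbols.

```python
# Standard symbols of the 118 elements, in atomic-number order.
SYMBOLS = """
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn
Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La
Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po
At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs Mt Ds Rg
Cn Nh Fl Mc Lv Ts Og
""".split()

BY_LOWER = {s.lower(): s for s in SYMBOLS}  # e.g. "cu" -> "Cu"


def representations(word):
    """Return every way to spell `word` as a list of element symbols."""
    word = word.lower()
    results = []

    def extend(pos, chosen):
        if pos == len(word):          # whole word covered
            results.append(list(chosen))
            return
        for size in (1, 2):           # symbols are one or two letters long
            piece = word[pos:pos + size]
            if len(piece) == size and piece in BY_LOWER:
                chosen.append(BY_LOWER[piece])
                extend(pos + size, chosen)
                chosen.pop()          # backtrack

    extend(0, [])
    return results


for rep in representations("calculus"):
    print(" ".join(rep))
```

Run as written, this prints the two "calculus" lines shown above.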
Moreover we will add one point to a word's "score" for every letter in the word which can appear as the first letter of a two-letter symbol and also as the second letter of a two-letter symbol in different valid elemental representations of the same word.
The word "calculus" scores zero points because the only difference between its two representations is C U vs Cu.
On the other hand, "snow" can be:
S N O W
Sn O W
S No W
Since the letter "n" appears both in Sn and No, "snow" gets one point.
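One literal reading of this rule is positional: a letter of the word scores if its position starts a two-letter symbol in one representation and ends a two-letter symbol in another. The Python sketch below implements only that reading, as an assumption about what the puzzle intends; `positional_score` is a hypothetical name, and representations are passed in as space-separated strings.

```python
def positional_score(reps):
    """Count letter positions that start a two-letter symbol in one
    representation and end a two-letter symbol in another."""
    firsts, seconds = set(), set()
    for rep in reps:
        pos = 0
        for symbol in rep.split():
            if len(symbol) == 2:
                firsts.add(pos)       # position of the capital
                seconds.add(pos + 1)  # position of the lower-case letter
            pos += len(symbol)
    # Within one representation a position belongs to exactly one symbol,
    # so a position in both sets must come from two different representations.
    return len(firsts & seconds)


print(positional_score(["S N O W", "Sn O W", "S No W"]))        # 1
print(positional_score(["C Al C U Lu S", "C Al Cu Lu S"]))      # 0
```

Under this reading "snow" scores 1 (the "n") and "calculus" scores 0, in agreement with the two examples.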
Your challenge: If k is the maximum score of the set of all words in the English language, try to find an example word for each score from 1 to k.
To show a more complicated example of how I'm assuming the scoring works, take "presses":
P Re S S Es
P Re S Se S
Pr Es S Es
Pr Es Se S
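Score is 19, based on this table (rows give the representation supplying the capitals, columns the representation supplying the lower-case letters):

            lower case row
cap row      1   2   3   4
   1         0   1   1   2
   2         1   0   2   2
   3         2   2   0   2
   4         2   1   1   0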
For example, row 3 has two Es symbols, so it counts for 2 against row 1's Re, as each capital in a two-letter symbol counts separately. Each lower-case letter, however, counts only once: row 2 has an Re that matches the Pr in row 3, and an Se that matches both Es symbols in row 3, but the latter contributes only 1, for a total of 2 based on row 2's capitals and row 3's lower-case letters (counting only two-letter symbols).
This is based on reading the wording "every letter in the word which can appear as the first letter of a two-letter symbol" to mean that the letter counts so long as it appears as lower case in the other representation.
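The pairwise counting just described can be sketched in a few lines of Python (again a sketch with hypothetical names `pair_score` and `score_table`, not the MATLAB further down). It assumes, as the description does, that a capital matches whenever the same letter occurs in lower case anywhere in the other representation, regardless of position.

```python
def pair_score(cap_rep, lower_rep):
    """Number of two-letter symbols in cap_rep whose capital also occurs
    as a lower-case letter somewhere in lower_rep."""
    lower_letters = {s[1] for s in lower_rep.split() if len(s) == 2}
    return sum(1 for s in cap_rep.split()
               if len(s) == 2 and s[0].lower() in lower_letters)


def score_table(reps):
    """Matrix of pair scores; a representation is not compared with itself."""
    return [[0 if i == j else pair_score(a, b)
             for j, b in enumerate(reps)]
            for i, a in enumerate(reps)]


presses = ["P Re S S Es", "P Re S Se S", "Pr Es S Es", "Pr Es Se S"]
table = score_table(presses)
for row in table:
    print(*row)
print("total:", sum(map(sum, table)))
```

For the four "presses" representations this reproduces the table above and the total of 19.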
The largest score found was 4592, for innocuousnesses.
In N O C U O U S N Es S Es
In N O C U O U S N Es Se S
In N O C U O U S Ne S S Es
In N O C U O U S Ne S Se S
In N O C U O U Sn Es S Es
In N O C U O U Sn Es Se S
In N O Cu O U S N Es S Es
In N O Cu O U S N Es Se S
In N O Cu O U S Ne S S Es
In N O Cu O U S Ne S Se S
In N O Cu O U Sn Es S Es
In N O Cu O U Sn Es Se S
In No C U O U S N Es S Es
In No C U O U S N Es Se S
In No C U O U S Ne S S Es
In No C U O U S Ne S Se S
In No C U O U Sn Es S Es
In No C U O U Sn Es Se S
In No Cu O U S N Es S Es
In No Cu O U S N Es Se S
In No Cu O U S Ne S S Es
In No Cu O U S Ne S Se S
In No Cu O U Sn Es S Es
In No Cu O U Sn Es Se S
I N N O C U O U S N Es S Es
I N N O C U O U S N Es Se S
I N N O C U O U S Ne S S Es
I N N O C U O U S Ne S Se S
I N N O C U O U Sn Es S Es
I N N O C U O U Sn Es Se S
I N N O Cu O U S N Es S Es
I N N O Cu O U S N Es Se S
I N N O Cu O U S Ne S S Es
I N N O Cu O U S Ne S Se S
I N N O Cu O U Sn Es S Es
I N N O Cu O U Sn Es Se S
I N No C U O U S N Es S Es
I N No C U O U S N Es Se S
I N No C U O U S Ne S S Es
I N No C U O U S Ne S Se S
I N No C U O U Sn Es S Es
I N No C U O U Sn Es Se S
I N No Cu O U S N Es S Es
I N No Cu O U S N Es Se S
I N No Cu O U S Ne S S Es
I N No Cu O U S Ne S Se S
I N No Cu O U Sn Es S Es
I N No Cu O U Sn Es Se S
The numbers in the following table add up to the 4592 figure and can be used to verify the pairwise contributions to the total. The row number is the line (in the list above) that supplies the capital letters, and the column number is the line that supplies the lower-case letters.
0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2
1 0 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2
1 2 0 2 1 2 1 2 2 2 1 2 1 2 2 2 1 2 1 2 2 2 1 2 0 1 1 1 1 2 0 1 1 1 1 2 0 1 1 1 1 2 0 1 1 1 1 2
2 2 2 0 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 1 1 1 0 2 2 1 1 1 0 2 2 1 1 1 0 2 2 1 1 1 0 2 2
1 3 3 2 0 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3
2 3 3 1 2 0 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3
0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2
1 2 2 1 1 2 1 0 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2
1 2 2 2 1 2 1 2 0 2 1 2 1 2 2 2 1 2 1 2 2 2 1 2 0 1 1 1 1 2 0 1 1 1 1 2 0 1 1 1 1 2 0 1 1 1 1 2
2 2 2 1 2 2 2 2 2 0 2 2 2 2 2 1 2 2 2 2 2 1 2 2 1 1 1 0 2 2 1 1 1 0 2 2 1 1 1 0 2 2 1 1 1 0 2 2
1 3 3 2 1 3 1 3 3 2 0 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3
2 3 3 1 2 3 2 3 3 1 2 0 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3
1 3 3 3 1 3 1 3 3 3 1 3 0 3 3 3 1 3 1 3 3 3 1 3 0 2 2 2 1 3 0 2 2 2 1 3 0 2 2 2 1 3 0 2 2 2 1 3
2 3 3 2 2 3 2 3 3 2 2 3 2 0 3 2 2 3 2 3 3 2 2 3 1 2 2 1 2 3 1 2 2 1 2 3 1 2 2 1 2 3 1 2 2 1 2 3
2 3 3 3 2 3 2 3 3 3 2 3 2 3 0 3 2 3 2 3 3 3 2 3 0 1 1 1 2 3 0 1 1 1 2 3 0 1 1 1 2 3 0 1 1 1 2 3
3 3 3 2 3 3 3 3 3 2 3 3 3 3 3 0 3 3 3 3 3 2 3 3 1 1 1 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3
2 4 4 3 2 4 2 4 4 3 2 4 2 4 4 3 0 4 2 4 4 3 2 4 1 3 3 2 2 4 1 3 3 2 2 4 1 3 3 2 2 4 1 3 3 2 2 4
3 4 4 2 3 4 3 4 4 2 3 4 3 4 4 2 3 0 3 4 4 2 3 4 2 3 3 1 3 4 2 3 3 1 3 4 2 3 3 1 3 4 2 3 3 1 3 4
1 3 3 3 1 3 1 3 3 3 1 3 1 3 3 3 1 3 0 3 3 3 1 3 0 2 2 2 1 3 0 2 2 2 1 3 0 2 2 2 1 3 0 2 2 2 1 3
2 3 3 2 2 3 2 3 3 2 2 3 2 3 3 2 2 3 2 0 3 2 2 3 1 2 2 1 2 3 1 2 2 1 2 3 1 2 2 1 2 3 1 2 2 1 2 3
2 3 3 3 2 3 2 3 3 3 2 3 2 3 3 3 2 3 2 3 0 3 2 3 0 1 1 1 2 3 0 1 1 1 2 3 0 1 1 1 2 3 0 1 1 1 2 3
3 3 3 2 3 3 3 3 3 2 3 3 3 3 3 2 3 3 3 3 3 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3
2 4 4 3 2 4 2 4 4 3 2 4 2 4 4 3 2 4 2 4 4 3 0 4 1 3 3 2 2 4 1 3 3 2 2 4 1 3 3 2 2 4 1 3 3 2 2 4
3 4 4 2 3 4 3 4 4 2 3 4 3 4 4 2 3 4 3 4 4 2 3 0 2 3 3 1 3 4 2 3 3 1 3 4 2 3 3 1 3 4 2 3 3 1 3 4
0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2
1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 0 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2
1 2 2 2 1 2 1 2 2 2 1 2 1 2 2 2 1 2 1 2 2 2 1 2 0 1 0 1 1 2 0 1 1 1 1 2 0 1 1 1 1 2 0 1 1 1 1 2
2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 1 1 1 0 2 2 1 1 1 0 2 2 1 1 1 0 2 2 1 1 1 0 2 2
1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 0 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3
2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 0 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3
0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2 0 2 2 2 0 2
1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2 1 0 2 1 1 2 1 2 2 1 1 2 1 2 2 1 1 2
1 2 2 2 1 2 1 2 2 2 1 2 1 2 2 2 1 2 1 2 2 2 1 2 0 1 1 1 1 2 0 1 0 1 1 2 0 1 1 1 1 2 0 1 1 1 1 2
2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 2 1 2 2 1 1 1 0 2 2 1 1 1 0 2 2 1 1 1 0 2 2 1 1 1 0 2 2
1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 1 3 1 3 3 2 0 3 1 3 3 2 1 3 1 3 3 2 1 3
2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 3 2 3 3 1 2 0 2 3 3 1 2 3 2 3 3 1 2 3
1 3 3 3 1 3 1 3 3 3 1 3 1 3 3 3 1 3 1 3 3 3 1 3 0 2 2 2 1 3 0 2 2 2 1 3 0 2 2 2 1 3 0 2 2 2 1 3
2 3 3 2 2 3 2 3 3 2 2 3 2 3 3 2 2 3 2 3 3 2 2 3 1 2 2 1 2 3 1 2 2 1 2 3 1 0 2 1 2 3 1 2 2 1 2 3
2 3 3 3 2 3 2 3 3 3 2 3 2 3 3 3 2 3 2 3 3 3 2 3 0 1 1 1 2 3 0 1 1 1 2 3 0 1 0 1 2 3 0 1 1 1 2 3
3 3 3 2 3 3 3 3 3 2 3 3 3 3 3 2 3 3 3 3 3 2 3 3 1 1 1 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3
2 4 4 3 2 4 2 4 4 3 2 4 2 4 4 3 2 4 2 4 4 3 2 4 1 3 3 2 2 4 1 3 3 2 2 4 1 3 3 2 0 4 1 3 3 2 2 4
3 4 4 2 3 4 3 4 4 2 3 4 3 4 4 2 3 4 3 4 4 2 3 4 2 3 3 1 3 4 2 3 3 1 3 4 2 3 3 1 3 0 2 3 3 1 3 4
1 3 3 3 1 3 1 3 3 3 1 3 1 3 3 3 1 3 1 3 3 3 1 3 0 2 2 2 1 3 0 2 2 2 1 3 0 2 2 2 1 3 0 2 2 2 1 3
2 3 3 2 2 3 2 3 3 2 2 3 2 3 3 2 2 3 2 3 3 2 2 3 1 2 2 1 2 3 1 2 2 1 2 3 1 2 2 1 2 3 1 0 2 1 2 3
2 3 3 3 2 3 2 3 3 3 2 3 2 3 3 3 2 3 2 3 3 3 2 3 0 1 1 1 2 3 0 1 1 1 2 3 0 1 1 1 2 3 0 1 0 1 2 3
3 3 3 2 3 3 3 3 3 2 3 3 3 3 3 2 3 3 3 3 3 2 3 3 1 1 1 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3 1 1 1 0 3 3
2 4 4 3 2 4 2 4 4 3 2 4 2 4 4 3 2 4 2 4 4 3 2 4 1 3 3 2 2 4 1 3 3 2 2 4 1 3 3 2 2 4 1 3 3 2 0 4
3 4 4 2 3 4 3 4 4 2 3 4 3 4 4 2 3 4 3 4 4 2 3 4 2 3 3 1 3 4 2 3 3 1 3 4 2 3 3 1 3 4 2 3 3 1 3 0
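As a cross-check of the table, the 48 representations can be rebuilt as a Cartesian product of the alternatives for each stretch of the word and scored pairwise. This is a Python sketch under the same matching assumption as the table (capital of a two-letter symbol against any lower-case letter in the other line); `SEGMENTS` and `pair_score` are illustrative names, and the segment alternatives are read off the list of 48 lines.

```python
from itertools import product

# Alternatives for each stretch of "innocuousnesses", taken from the 48 lines.
SEGMENTS = [
    ["In", "I N"],
    ["N O", "No"],
    ["C U", "Cu"],
    ["O U"],
    ["S N Es", "S Ne S", "Sn Es"],
    ["S Es", "Se S"],
]
# product() varies the last segment fastest, which matches the listed order.
reps = [" ".join(parts) for parts in product(*SEGMENTS)]


def pair_score(cap_rep, lower_rep):
    """Two-letter symbols in cap_rep whose capital occurs in lower case in lower_rep."""
    lower_letters = {s[1] for s in lower_rep.split() if len(s) == 2}
    return sum(1 for s in cap_rep.split()
               if len(s) == 2 and s[0].lower() in lower_letters)


matrix = [[0 if i == j else pair_score(a, b) for j, b in enumerate(reps)]
          for i, a in enumerate(reps)]

print(len(reps), sum(map(sum, matrix)))  # 48 4592
print(*matrix[0])                        # first row of the table
```

The grand total comes out to 4592, and the first row printed is the first row of the table.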
clearvars,clc
global names symbols build ways

% Read the element table. The symbol is taken from the second
% space-separated field of each line and the name from the third.
fid=fopen('c:\vb5 projects\flooble\chemical elements.txt','r');
names=string.empty; symbols=string.empty;
while ~feof(fid)
    l=fgetl(fid);
    f=find(l==' ');
    symbols(end+1)=string(l(f(1)+1:f(2)-1));
    names(end+1)=string(l(f(2)+1:f(3)-1));
    if names(end)==""                  % skip lines with an empty name field
        names(end)=[];
        symbols(end)=[];
    end
end
fclose(fid);

% From here on, names holds the word list instead of the element names.
names=string.empty;
fid=fopen('c:\words\words.txt','r');
while ~feof(fid)
    names(end+1)=fgetl(fid);
    if ~isequal(names(end),lower(names(end)))
        % stop reading at the first word that is not all lower case
        names(end)=[];
        break
    end
end
fclose(fid);

sets={}; counts=[];
count=0;
for i=1:length(names)
    build='';
    ways={};
    count=0;
    addon(i,1);                        % fills ways with every representation of names(i)
    if length(ways)>1
        for j=1:length(ways)
            Rep=ways{j};               % representation supplying the capitals
            for k=1:length(ways)
                if j~=k
                    rep=ways{k};       % representation supplying the lower-case letters
                    for psn=1:length(Rep)-1
                        % a capital followed by a lower-case letter is a two-letter symbol
                        if Rep(psn)==upper(Rep(psn)) && ...
                                Rep(psn+1)>='a' && Rep(psn+1)<='z'
                            srch=lower(Rep(psn));
                            f=strfind(rep,srch);
                            if ~isempty(f)
                                count=count+1;
                            end
                        end
                    end
                end
            end
        end
    end
    if count>0
        sets{end+1}=ways;
        counts(end+1)=count;
    end
end

function addon(i,wh)
% Recursively extend build with each symbol that matches names(i) at position wh.
global names symbols build ways
if wh==1
    build='';
end
rest= lower(extractBetween(names(i),wh,length(char(names(i)))));
for j=1:length(symbols)
    trial= lower(symbols(j));
    f=strfind(rest,trial);
    if ~isempty(f)
        if f(1)==1                     % symbol matches at the start of rest
            build=[ build char(symbols(j))];
            if length(build)==strlength(names(i))
                % whole word covered: store it with a space before each capital
                % disp([build ' ' char(names(i))])
                ways{end+1}='';
                for k=1:length(build)
                    if build(k)==upper(build(k))
                        ways{end}=[ways{end} sprintf('%s',' ')];
                    end
                    ways{end}=[ways{end} sprintf('%s',build(k))];
                end
            else
                addon(i,wh+strlength(trial));
            end
            % backtrack: drop the symbol just tried
            build=extractBefore(build,length(build)-strlength(trial)+1);
        end
    end
end
end
That program was followed by the following script while the counts and sets arrays were still in the workspace:
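clc
[sCts,idx]=sort(counts);               % scores in ascending order
sNames=sets([idx]);
prev=0;
for i=1:length(sCts)
    if sCts(i)>prev
        if sCts(i)-prev>1
            % mark a gap and show how many scores were skipped
            fprintf('%12s %4d\n','------------',sCts(i)-prev-1)
        end
        ctct=0;                        % words shown so far for this score
    end
    ctct=ctct+1;
    if ctct<3                          % show at most two words per score
        disp(sCts(i))
        ss=sNames{i};
        for j=1:length(ss )
            disp(ss{j})
        end
    end
    prev=sCts(i);
end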
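The reporting logic of that script (up to two words per score, with a hyphen marker wherever scores are skipped) can be sketched in Python as follows. This is a simplified illustration: `report` is a hypothetical helper that returns one line per word instead of printing every representation, and the word names and scores in the usage lines are made-up placeholders, not results.

```python
from collections import defaultdict


def report(word_scores, per_score=2):
    """Return report lines: up to `per_score` words for each score in
    ascending order, with a marker line wherever scores are skipped."""
    by_score = defaultdict(list)
    for word, score in word_scores.items():
        by_score[score].append(word)
    lines, prev = [], 0
    for score in sorted(by_score):
        if score - prev > 1:
            # hyphens, then the number of skipped scores
            lines.append(f"------------ {score - prev - 1:4d}")
        for word in by_score[score][:per_score]:
            lines.append(f"{score} {word}")
        prev = score
    return lines


# Placeholder data only: w1..w5 are not real words or real scores.
demo = {"w1": 1, "w2": 1, "w3": 1, "w4": 2, "w5": 5}
print("\n".join(report(demo)))
```

With the placeholder data, the third word at score 1 is dropped and a marker showing 2 missing scores appears between the entries for 2 and 5.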
An example:
30 lower-case matches
B I O S C O P Y
B I O S Co P Y      2 Sc below
B I O Sc O P Y      4 Os below
B I Os C O P Y      3 Co below
B I Os Co P Y       2 Co 2 Sc (its own Co doesn't count)
Bi O S C O P Y
Bi O S Co P Y       4 Os 2 Sc
Bi O Sc O P Y       4 Os
Bi Os C O P Y       3 Co
Bi Os Co P Y        2 Co 2 Sc (its own Co doesn't count)
In the list, a single word is shown for each of the first 91 score values: I replaced the program's choice of the first two words with one word that I chose as being more common. After the 91st value, a maximum of 2 words are shown for any given score.
Gaps are marked with strings of hyphens, followed by the number of missing scores. The first gap is at a score of 89; no word was found with this score. The first gap of more than one missing score is 117 and 118; no word has either of these scores.
I'll post the list separately.
Posted by Charlie on 2022-07-15 11:17:34