Repetitions in frequency-alpha-alldicts.txt #5

bszollosinagy · 2023-05-18T00:09:35Z

The word "ascetic" appears more than once in the file: at ranks 18614, 25054, 63318, and 104505.

The words "copious" and "verdant" also appear more than once, for some reason.

Can the counts be simply summed across all occurrences?

hackerb9 · 2023-09-12T22:56:32Z

$ grep ascetic frequency-alpha-alldicts.txt 
18614      ascetic                      2,875,469    0.000199%   97.305329%
25054      asceticism                   1,605,339    0.000111%   98.265396%
63318      ascetical                      153,464    0.000011%   99.760632%
104505     ascetically                     24,997    0.000002%   99.955170%
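Since grep matches substrings, an exact comparison on the word column is a quicker way to check for true repeats. A minimal sketch in Python, assuming the whole file uses the whitespace-separated layout shown above (rank, word, count, percent, cumulative percent); the helper name `word_occurrences` is made up for illustration and the sample rows are the four lines from the grep output:

```python
from collections import Counter

# Rows copied from the grep output above. Assumption: the real file uses
# the same whitespace-separated layout (rank, word, count, ...).
SAMPLE = """\
18614      ascetic                      2,875,469    0.000199%   97.305329%
25054      asceticism                   1,605,339    0.000111%   98.265396%
63318      ascetical                      153,464    0.000011%   99.760632%
104505     ascetically                     24,997    0.000002%   99.955170%
"""


def word_occurrences(lines):
    """Count how often each exact word appears in the second column."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Skip blank lines and anything that does not start with a rank.
        if len(fields) >= 2 and fields[0].isdigit():
            counts[fields[1]] += 1
    return counts


counts = word_occurrences(SAMPLE.splitlines())
duplicates = sorted(word for word, n in counts.items() if n > 1)

print(counts["ascetic"])  # 1
print(duplicates)         # []
```

To run it over the full file, pass an open file handle instead of `SAMPLE.splitlines()`.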

It would be nice to be able to merge different forms of the same root together, as a dictionary does, but that information is not included in the Google corpus.

Do you know of any database I could use for such merging? I'm not going to write an automatic algorithm for it as it'd end up merging "cop" with "copy" and "copious".
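To illustrate the over-merging problem, here is a deliberately naive sketch that groups each word under the shortest listed word that is a prefix of it. This is a hypothetical rule written only for this example (the function name `naive_prefix_groups` is made up), not something the project uses:

```python
def naive_prefix_groups(words):
    """Group each word under the shortest listed word that is a prefix of it.

    Deliberately naive: it has no notion of meaning, only of spelling.
    """
    groups = {}
    # Visit shorter words first so they become the candidate "roots".
    for word in sorted(words, key=len):
        root = next((r for r in groups if word.startswith(r)), word)
        groups.setdefault(root, []).append(word)
    return groups


words = ["ascetic", "asceticism", "ascetical", "ascetically",
         "cop", "copy", "copious"]
groups = naive_prefix_groups(words)

print(groups["ascetic"])
# ['ascetic', 'ascetical', 'asceticism', 'ascetically']
print(groups["cop"])
# ['cop', 'copy', 'copious']
```

The "ascetic" group looks reasonable, but "cop", "copy", and "copious" are unrelated words that land in one group, which is why a curated word-form database would be needed rather than a spelling rule.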
