Hermetic Word Frequency Counter Advanced Version
Sorting Documents by the Number of Occurrences of a Word or Phrase

Problem: You have several or many documents, and you want to identify those that have the most occurrences of a particular word or phrase.
Solution: Run this program on those documents using the report format "word file-list(+freq)" and specifying the word or phrase in the "Extra count-only words/phrases" textbox in the WFCA Settings window. The result will be a list of files ordered by frequency of occurrence of that word or phrase.


Sounds easy? It is. Here's an example.

Carroll Quigley's 1966 book Tragedy and Hope: A History of the World in Our Time can be downloaded online. It has 21 chapters. Suppose we are interested in the First and Second World Wars, and wish to know which chapters mention them, and which mention them most. We put the 21 chapters into a subfolder and set the "Extra count-only words/phrases" textbox as shown:

We click on the "Count words/phrases" button, then on the "Count only specified words/phrases" button, and after a couple of minutes the result appears:


This tells us that the chapter "05.html" has the most occurrences (7) of "first world war". Scrolling down shows that the chapter "15.html" has the most occurrences (3) of "second world war". We can also see which chapters do not contain these phrases.
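
For readers who want to see the idea behind this kind of report, here is a minimal Python sketch of counting a phrase in every file of a folder and listing the files by frequency. It is not the program's own implementation. The function names count_phrase and rank_files are hypothetical, and the matching rules are assumptions: case-insensitive, whole words only, any whitespace allowed between the words of a phrase. It also counts in the raw file text, so HTML markup is not stripped.

```python
import re
from pathlib import Path


def count_phrase(text: str, phrase: str) -> int:
    """Count case-insensitive, whole-word occurrences of phrase in text."""
    words = [re.escape(w) for w in phrase.split()]
    if not words:
        return 0  # an empty phrase never matches
    # Allow any run of whitespace (including line breaks) between words.
    pattern = r"\b" + r"\s+".join(words) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def rank_files(folder: str, phrases: list[str]) -> dict[str, list[tuple[str, int]]]:
    """For each phrase, return (filename, count) pairs, highest count first."""
    # Read every regular file in the folder once (assumed to be text files).
    texts = {
        p.name: p.read_text(encoding="utf-8", errors="replace")
        for p in sorted(Path(folder).iterdir())
        if p.is_file()
    }
    report = {}
    for phrase in phrases:
        counts = [(name, count_phrase(text, phrase)) for name, text in texts.items()]
        # Sort by descending count, then by filename; files with a count
        # of 0 end up at the bottom of the list.
        report[phrase] = sorted(counts, key=lambda nc: (-nc[1], nc[0]))
    return report


if __name__ == "__main__":
    # "chapters" is a hypothetical subfolder holding the chapter files.
    if Path("chapters").is_dir():
        result = rank_files("chapters", ["first world war", "second world war"])
        for phrase, ranking in result.items():
            print(phrase)
            for name, count in ranking:
                print(f"  {name}: {count}")
```

Files that do not contain a phrase are kept in the list with a count of 0, so they can be identified as well.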
