Hermetic Word Frequency Counter Advanced Version
Filter Found Phrases

Suppose we have a file containing Chapter 3 of George Orwell's novel Nineteen Eighty-Four, in which Eastasia, Eurasia and Oceania are perpetually at war. We can easily count the occurrences of these names by specifying "Eastasia, Eurasia, Oceania" in the extra count-only words text box in the Settings panel, then clicking on "Count words/phrases" and then on "Count only specified words/phrases", to obtain:

Suppose now we wish to find all phrases which contain all three names. First we allow only numerals, commas, apostrophes and hyphens in words (in the Settings panel). Then we set order to "alphabetical" and format to "frequency word/phrase". Then we click on "Count all phrases" and set up the operation as follows:

Click on "Count phrases" and the program proceeds to find over 20,000 phrases, but after eliminating redundant subphrases it reduces these to 694:

Now we filter the results as shown below:

Filter results

to obtain:

To get each phrase on a single line, first close (not quit) the "Count all phrases" window, then widen the main window:

To illustrate the use of the options, suppose we have a file with a list of phrases (one per line), some of which include 'shoes', such as:

red tennis shoes
green tennis shoes
old hiking boots
buy leather shoes
buy red tennis shoes
buy green leather shoes
rent or buy hiking shoes or boots
don't buy red leather shoes
buy leather shoes
buy green or white shoes or slippers
green or white shoes

After allowing the apostrophe in words, and setting the program to terminate a phrase at the end of a line, counting phrases of length from 3 to 8 words produces:

To obtain just the phrases which have 'buy' followed somewhere (not necessarily immediately) by 'shoes', we set up the filter words and option as follows:

Clicking on "Filter results" then produces:

If we had filtered the phrases using "as subphrase" then no phrases would have been displayed because "buy shoes" does not occur as a subphrase.
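The difference between the two kinds of filtering can be sketched in a few lines of Python. This is only an illustration of the idea, not the program's own code: the function names (tokenize, contains_in_order, contains_as_subphrase) are hypothetical, and for simplicity the filters are applied to the whole lines of the sample file rather than to the counted phrases of 3 to 8 words.

```python
def tokenize(phrase):
    # Split on whitespace; apostrophes stay inside words, e.g. "don't".
    return phrase.lower().split()


def contains_in_order(phrase, words):
    # Each filter word must occur after the previous one; gaps are allowed.
    # "w in remaining" consumes the iterator up to and including the match.
    remaining = iter(tokenize(phrase))
    return all(w in remaining for w in words)


def contains_as_subphrase(phrase, words):
    # The filter words must occur next to each other, in the given order.
    tokens = tokenize(phrase)
    n = len(words)
    return any(tokens[i:i + n] == list(words)
               for i in range(len(tokens) - n + 1))


# The sample list from above (one phrase per line in the file).
phrases = [
    "red tennis shoes",
    "green tennis shoes",
    "old hiking boots",
    "buy leather shoes",
    "buy red tennis shoes",
    "buy green leather shoes",
    "rent or buy hiking shoes or boots",
    "don't buy red leather shoes",
    "buy leather shoes",
    "buy green or white shoes or slippers",
    "green or white shoes",
]

filter_words = ["buy", "shoes"]
in_order = [p for p in phrases if contains_in_order(p, filter_words)]
as_subphrase = [p for p in phrases if contains_as_subphrase(p, filter_words)]

for p in in_order:
    print(p)
print(len(as_subphrase))  # 0, since "buy shoes" never occurs as adjacent words
```

In this sketch the in-order filter keeps every line in which 'buy' appears somewhere before 'shoes', while the subphrase filter keeps nothing, which matches the behaviour described above.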

As long as the window with the "Filter results" button remains open, you can change the filter options and filter words (and then filter the results again) without re-scanning the file(s), provided you leave the words-to-ignore option unchanged. If you change that option, you will need to click on the "Count phrases" button to re-scan the file(s).
