Hermetic Word Frequency Counter Advanced Version
Filter Found Phrases
Suppose we have a file containing Chapter 3 of George Orwell's novel Nineteen Eighty-Four, in which Eastasia, Eurasia and Oceania are perpetually at war. We can count the occurrences of these names by entering "Eastasia, Eurasia, Oceania" in the extra count-only words text box in the Settings panel, then clicking on "Count words/phrases" and then on "Count only specified words/phrases" to obtain:
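For readers who want to see the idea outside the program, counting only a specified set of words can be sketched in a few lines of Python. This is an illustration, not the program's own method: the function name count_specified, the tokenization rule (runs of letters, apostrophes and hyphens) and the case-insensitive matching are all assumptions, and the sample sentence is invented rather than taken from the novel.

```python
import re
from collections import Counter

def count_specified(text, names):
    """Count whole-word, case-insensitive occurrences of each name in text."""
    # Assumed tokenization: a word is a run of letters, apostrophes and hyphens.
    words = re.findall(r"[a-z'-]+", text.lower())
    counts = Counter(words)
    # Report every requested name, including those that never occur (count 0).
    return {name: counts[name.lower()] for name in names}

# Invented sample text, used only to exercise the function.
sample = "Oceania fights Eurasia today and Eastasia tomorrow; Oceania never rests."
result = count_specified(sample, ["Eastasia", "Eurasia", "Oceania"])
print(result)
```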
Suppose now we wish to find all phrases which contain all three names. First we allow only numerals, commas, apostrophes and hyphens in words (in the Settings panel). Then we set order to "alphabetical" and format to "frequency word/phrase". Then we click on "Count all phrases" and set up the operation as follows:
Click on "Count phrases" and the program proceeds to find over 20,000 phrases, but after eliminating redundant subphrases it reduces these to 694:
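The manual does not say how the program decides that a subphrase is redundant. One plausible reading, sketched below in Python, is that a phrase is dropped when it occurs inside a longer found phrase that has the same count, since it then adds no information. That rule, the function names is_subphrase and drop_redundant, and the sample counts are assumptions made for illustration.

```python
def is_subphrase(short, long):
    """True if the word list `short` occurs contiguously inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))

def drop_redundant(counts):
    """Drop each phrase contained in a longer phrase with the same count.

    Assumed rule, not confirmed by the manual. The pairwise comparison is
    quadratic, so this is a sketch rather than an efficient implementation.
    """
    split = {phrase: phrase.split() for phrase in counts}
    kept = {}
    for phrase, count in counts.items():
        redundant = any(
            len(split[other]) > len(split[phrase])
            and counts[other] == count
            and is_subphrase(split[phrase], split[other])
            for other in counts
        )
        if not redundant:
            kept[phrase] = count
    return kept

# Invented counts: "at war" always occurs as part of "at war with".
found = {"at war with": 2, "war with": 3, "at war": 2}
reduced = drop_redundant(found)
print(reduced)
```

Under this rule "at war" is removed because it has the same count as "at war with", while "war with" is kept because it also occurs elsewhere.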
Now we filter the results as shown below:
To get each phrase on a single line, first close (not quit) the "Count all phrases" window, then widen the main window:
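The filtering step in this example, keeping only the phrases that contain all three names, can be approximated as follows. This is a minimal sketch under stated assumptions: the helper names words_of and contains_all are hypothetical, the word order is ignored, matching is case-insensitive, punctuation around words is discarded, and the sample phrases are invented rather than taken from the program's output.

```python
import re

def words_of(phrase):
    """Split a phrase into lowercase words (letters, digits, ' and -)."""
    return re.findall(r"[a-z0-9'-]+", phrase.lower())

def contains_all(phrase, required):
    """True if every required word occurs somewhere in the phrase."""
    present = set(words_of(phrase))
    return all(name.lower() in present for name in required)

names = ["Eastasia", "Eurasia", "Oceania"]

# Invented phrases, used only to exercise the filter.
phrases = [
    "Oceania was allied with Eastasia against Eurasia",
    "Oceania and Eurasia",
    "Eurasia, Eastasia, Oceania",
]

matches = [p for p in phrases if contains_all(p, names)]
for p in matches:
    print(p)
```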
To illustrate the use of the options, suppose we have a file with a list of phrases (one per line), some of which include 'shoes', such as:
red tennis shoes
green tennis shoes
old hiking boots
buy leather shoes
buy red tennis shoes
buy green leather shoes
rent or buy hiking shoes or boots
don't buy red leather shoes
buy leather shoes
buy green or white shoes or slippers
green or white shoes
After allowing the apostrophe in words, and setting the program to terminate a phrase at the end of a line, counting phrases of length from 3 to 8 words produces:
To obtain just the phrases which have 'buy' followed somewhere (not necessarily immediately) by 'shoes', we set up the filter words and option as follows:
Clicking on "Filter results" then produces:
If we had filtered the phrases using "as subphrase" then no phrases would have been displayed because "buy shoes" does not occur as a subphrase.
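The difference between the two behaviours described above, filter words appearing in order with gaps allowed versus filter words appearing as a contiguous subphrase, can be sketched in Python. The function names in_order and as_subphrase are hypothetical, and the sketch is applied only to the whole lines of the sample list, whereas the program filters every counted phrase of 3 to 8 words, so its displayed results will contain more entries.

```python
def in_order(phrase_words, filter_words):
    """True if filter_words occur in phrase_words in this order, gaps allowed."""
    remaining = iter(phrase_words)
    # `w in remaining` consumes the iterator up to the match, so each
    # filter word must be found after the previous one.
    return all(w in remaining for w in filter_words)

def as_subphrase(phrase_words, filter_words):
    """True if filter_words occur in phrase_words as adjacent words."""
    n = len(filter_words)
    return any(
        phrase_words[i:i + n] == filter_words
        for i in range(len(phrase_words) - n + 1)
    )

# The sample lines from the list above (one phrase per line).
lines = [
    "red tennis shoes",
    "green tennis shoes",
    "old hiking boots",
    "buy leather shoes",
    "buy red tennis shoes",
    "buy green leather shoes",
    "rent or buy hiking shoes or boots",
    "don't buy red leather shoes",
    "buy leather shoes",
    "buy green or white shoes or slippers",
    "green or white shoes",
]

filter_words = ["buy", "shoes"]
ordered = [p for p in lines if in_order(p.split(), filter_words)]
adjacent = [p for p in lines if as_subphrase(p.split(), filter_words)]

print(ordered)   # lines with 'buy' somewhere before 'shoes'
print(adjacent)  # empty: 'buy shoes' never occurs as adjacent words
```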
As long as the window with the "Filter results" button remains open, you can change the filter options and filter words and then filter the results again without re-scanning the file(s), provided you do not change the words-to-ignore option. If you do change that option, you will need to click on the "Count phrases" button to re-scan the file(s).