A Customizable Multiple-File Word and Phrase Count Program for Windows
There are two versions of this software: Hermetic Word Frequency Counter (WFC) and Hermetic Word Frequency Counter Advanced Version (WFCA). These are two separate programs. The main difference is that WFC counts all words in one text or text-like file at a time (including HTML and XML files, but not PDF or MS Word doc files), whereas WFCA counts all words or phrases in multiple files in a single operation. If you need to count words in only one file at a time, then WFC is what you need. (See the WFC page.) If you have many files, wish to count phrases, or need more options and functionality, then you need WFCA (so read on). More details are given below in Differences [of WFCA] from the Basic Version.
The Advanced Version
This software scans a text file (an ANSI text file, an HTML file, an XML file, etc.), multiple such text files, or text on the clipboard, and counts the number of occurrences of the different words and phrases (optionally ignoring common words such as "the" and "this", or words matching specified patterns). As well as counting all words and phrases, it can count the occurrences of only those words and phrases given in a specified list (optionally matching specified patterns). The words or phrases found can be listed alphabetically, reverse alphabetically or by frequency, with rank and frequency displayed for each word or phrase. The results can be written to a file which can be read into Excel for further processing.
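The kind of count described above can be illustrated with a minimal Python sketch. This is not the program's own code: the ignore list, the rule for what counts as a word, the tie-breaking order and the function names are all assumptions made for illustration.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical ignore list; the program's own list of common words is not reproduced here.
IGNORED = {"the", "this", "a", "and", "of"}

def count_words(text, ignored=IGNORED):
    """Count case-insensitive occurrences of each word not in `ignored`."""
    # Assumption: a word is a run of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in ignored)

def ranked(counts):
    """Return (rank, word, frequency) rows, most frequent first, ties alphabetical."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, freq) for rank, (word, freq) in enumerate(ordered, start=1)]

def to_csv(rows):
    """Render the rows as CSV text, a format that Excel can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["rank", "word", "frequency"])
    writer.writerows(rows)
    return buffer.getvalue()

sample = "The calendar year and the lunar year: this calendar follows the moon."
rows = ranked(count_words(sample))
print(to_csv(rows))
```

In this sketch tied words receive consecutive ranks in alphabetical order; the program itself may rank ties differently.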
The Advanced Version does everything that the basic version does, including support for UTF-8 encoded text. The section below details the additional functionality of the Advanced Version: chiefly, the ability to count words in multiple files, the ability to count phrases as well as words, and the ability to count occurrences of a word or a phrase which matches a specified pattern (including regular expressions). The user manual for the basic version should therefore be read before (or after) reading this page (but note that the appearance of the main window and of the 'Settings' window differs somewhat between the two versions).
This software counts words and phrases in ANSI text and text-like files (including HTML and XML files). It does not act directly on MS-Word .doc files, which are binary files; such files can be scanned if saved as "Plain Text" files (see Scannable Files in the user manual for the basic version).
Below is a sample screenshot showing the result of counting all words in 15 HTML files in a single folder (pages about the Chinese calendar downloaded from the web):
Here is another sample screenshot, showing the result of counting phrases in the same set of files.
Differences from the Basic Version
The following are some (but not all) features of the Advanced Version (WFCA) which are not present in the basic version (WFC):
The ability to:
- count all phrases (not just all words) in a file.
- scan not just one file but all files in a folder, and optionally in all subfolders, and to return a single report on the frequencies of words and phrases in all files scanned.
- specify not only a list of words to be ignored (such as common words in a natural language) but also a list of words and phrases to be counted.
- count a word or phrase matching a given pattern (e.g. st~, p?p).
- ignore a word matching a given pattern (e.g. ~ing).
- display words and phrases counted in reverse alphabetical order as well as in alphabetical order and by frequency.
- display, for each word or phrase found when scanning multiple files, the files in which it occurs, and how many times.
- include or exclude files of certain types (identified by file extension).
- generate an Excel-readable file containing a table of frequencies of words and phrases vs the files in which they occur.
- switch between multiple files containing words to ignore.
- switch between multiple files containing specified words/phrases to count.
- generate data which can be used to test whether a corpus of text conforms to Zipf's Law.
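The pattern examples in the list above (st~, p?p, ~ing) suggest a simple wildcard notation. The following Python sketch shows one way such patterns could be applied. It assumes, without confirmation from this page, that ~ stands for any run of characters and ? for exactly one character; the function names are hypothetical and this is not the program's own implementation.

```python
import re
from collections import Counter

def pattern_to_regex(pattern):
    """Translate a wildcard pattern into a compiled, case-insensitive regex.

    Assumed semantics (not confirmed by the text): '~' matches any run of
    characters, '?' matches exactly one character, anything else is literal.
    """
    parts = []
    for ch in pattern:
        if ch == "~":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def count_matching(words, pattern):
    """Count the words that match the pattern in full (lower-cased keys)."""
    rx = pattern_to_regex(pattern)
    return Counter(w.lower() for w in words if rx.fullmatch(w))

def drop_matching(words, pattern):
    """Return the words that do not match the pattern, i.e. ignore the matches."""
    rx = pattern_to_regex(pattern)
    return [w for w in words if not rx.fullmatch(w)]

words = "Stars stop spinning pop pup stamp popping".split()
print(count_matching(words, "st~"))   # words beginning with "st"
print(count_matching(words, "p?p"))   # three-letter words p_p
print(drop_matching(words, "~ing"))   # everything except words ending in "ing"
```

Under these assumptions, st~ counts "stars", "stop" and "stamp", p?p counts "pop" and "pup", and ~ing removes "spinning" and "popping" from the list.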
Both the main screen and the 'Settings' screen differ somewhat from those in the basic version, although all the functions of the basic version are retained. Here is what the 'Settings' screen looks like in the Advanced Version:
See Setting the Operation Parameters (in the user manual for the basic version) for further information.
As stated above, the Advanced Version does (almost) everything that the basic version does, so the following sections of the user manual for the basic version apply also to the Advanced Version.
Trial version: A copy of the Hermetic Word Frequency Counter Advanced Version installation program can be downloaded for the purpose of evaluation. Click on the following link for further information:
Download Hermetic Word Frequency Counter Advanced ...
Price and ordering: A single-user license for the fully functional software is available for a period of 3 months or 1 year, or with no time limit (a 'perpetual' license). Prices for each type of license are given at Purchase a User License. (A multiple-user license is also available for this program.) An activation key is required in order to make the trial version permanently fully functional. An activation key can be obtained immediately if you purchase a user license either via PayPal or via Share-it.
Refund: A refund will be provided promptly up to 30 days after purchase if the software does not perform satisfactorily.
Updates: Purchasers of a user license for this software are entitled to an update to any later version at no additional cost.
Upgrading from the non-Advanced version:
Purchasers of a perpetual user license for Hermetic Word Frequency Counter may upgrade to a perpetual license for the Advanced Version by paying $19.95, €14.95 or £12.75 (excluding any sales tax). To purchase the upgrade, click on one of the links below. Note that this is available only if a perpetual single-user license for Hermetic Word Frequency Counter has already been purchased.