Using Hermetic Word Frequency Counter with Large Files and Importing the Output into Excel
To confirm that Hermetic Word Frequency Counter works with files containing about 100,000 different words, a set of about 100,000 random 'words' (strings of 3-15 random letters) was generated. From this set a file was created consisting of 'sentences' composed of these words. This file is 2025 KB in size; the ZIP file (1315 KB) containing it may be downloaded by clicking on this link.
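A test file of this kind can be produced with a short script. The following is a minimal sketch in Python, not the method actually used to build the file described above: the function names, the sentence lengths (5 to 15 words), the lowercase-only alphabet and the fixed random seed are all assumptions made for illustration. It is run here at a small scale (1,000 words) so that it finishes instantly.

```python
import random
import string


def make_vocabulary(n_words, rng, min_len=3, max_len=15):
    """Return n_words distinct random 'words' of min_len..max_len lowercase letters."""
    vocab = set()
    while len(vocab) < n_words:
        length = rng.randint(min_len, max_len)
        word = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        vocab.add(word)  # the set silently discards any duplicate
    return sorted(vocab)  # sorted so the result is reproducible for a given seed


def make_sentences(vocab, n_sentences, rng, min_words=5, max_words=15):
    """Return 'sentences' built from randomly chosen vocabulary words.

    Assumption: 5 to 15 words per sentence; the source does not state this.
    """
    sentences = []
    for _ in range(n_sentences):
        n = rng.randint(min_words, max_words)
        words = [rng.choice(vocab) for _ in range(n)]
        sentences.append(" ".join(words).capitalize() + ".")
    return sentences


rng = random.Random(0)  # fixed seed, chosen arbitrarily
vocab = make_vocabulary(1000, rng)
sentences = make_sentences(vocab, 200, rng)
```

To approximate the file described above, raise the vocabulary size to 100,000, generate enough sentences to reach about 2 MB, and write them to a text file, one sentence per line. Because words are drawn at random, some vocabulary words may never appear in the sentences, so the number of different words in the file can be slightly lower than the vocabulary size.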
The program was applied to this file (with all checkboxes in the 'Settings' window cleared). At the point where it had extracted 31,910 words the program looked like this:
After 33 minutes (on a 64-bit 2.10 GHz PC) the program had extracted a total of 99,877 different words. After a brief processing period the words found were displayed:
At this point the file 'output.txt' was renamed to 'output_freq.txt'. Then the word order 'alphabetical' was selected, and the words were displayed alphabetically (with negligible wait time). A new 'output.txt' file was written, and this was renamed to 'output_alpha.txt'. Both files are contained in a ZIP file (1846 KB) which can be downloaded by clicking on this link.
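The difference between the two orderings can be illustrated with a small sketch in Python. This is not the program's own algorithm or output format: the tokenization rule (runs of ASCII letters, compared case-insensitively) and the function names are assumptions. It also suggests why switching the order is quick: the counts are computed once, and only the sort is repeated.

```python
import re
from collections import Counter


def count_words(text):
    """Count words in text.

    Assumption: a 'word' is a run of ASCII letters, compared case-insensitively.
    """
    return Counter(re.findall(r"[a-z]+", text.lower()))


def by_frequency(counts):
    """Most frequent first; ties are broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def alphabetical(counts):
    """Sorted by word, ignoring frequency."""
    return sorted(counts.items())


counts = count_words("the cat and the dog and the bird")
freq_rows = by_frequency(counts)
alpha_rows = alphabetical(counts)
```

Here `freq_rows` begins with `('the', 3)` and `('and', 2)`, while `alpha_rows` begins with `('and', 2)` and `('bird', 1)`.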
Opening 'output_freq.txt' in Excel Starter 2010 brings up the Text Import Wizard, which has three steps:
1. Select 'Delimited' and make sure that 'File origin' is set to 'Windows (ANSI)', then click on 'Next'.
2. Uncheck 'Tab' and check 'Space'. (Note: When doing this with the Advanced Version of this software you should leave 'Tab' checked; see Exporting Results to Excel.)
3. Click on 'Finish'.
Excel then displays the output data in a spreadsheet. Widen column D to see the words.
Excel 2003 has a limit of 65,536 rows; for 100,000 rows use Excel 2007 or later.
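Whether a given output will fit can be checked before opening it. The sketch below, in Python, compares a row count against the worksheet row limits of the two Excel file formats: 65,536 rows for the .xls format used by Excel 2003 and earlier, and 1,048,576 rows for the .xlsx format used by Excel 2007 and later. The function and dictionary names are hypothetical, and the sketch assumes one word per row.

```python
# Maximum worksheet rows per Excel file format.
EXCEL_ROW_LIMITS = {
    "xls": 65_536,      # Excel 2003 and earlier
    "xlsx": 1_048_576,  # Excel 2007 and later
}


def fits_in_excel(row_count, file_format="xlsx"):
    """Return True if row_count rows fit in one worksheet of the given format."""
    return row_count <= EXCEL_ROW_LIMITS[file_format]


# The test run described above produced 99,877 different words.
word_rows = 99_877
fits_2003 = fits_in_excel(word_rows, "xls")
fits_2007 = fits_in_excel(word_rows, "xlsx")
```

For the 99,877 words found in the test file, `fits_2003` is `False` and `fits_2007` is `True`.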