Text Search and Count

Hermetic Word Frequency Counter

This software scans an MS Word docx file or a text file (including HTML and XML files) encoded in ANSI or UTF-8 and counts the frequency of each distinct word. The words found can be listed alphabetically or by frequency. You can specify which characters may appear in words, so the program can allow or disallow words containing numerals, hyphens, apostrophes, underscores or colons. It can also ignore words that are short or that occur infrequently, treat upper and lower case as significant or not, and ignore words (e.g., common words such as 'this') listed in a specified file. The software can be used with text in languages other than English, in particular French, German, Italian and Spanish. The language of the text is detected automatically, and the corresponding 'common words' file (listing words to be ignored) is optionally loaded. Results can be written to an output file, which can then be read into Excel for further processing.
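
The counting options described above can be illustrated with a short Python sketch. This is not the program's own code; it only shows one way such options might interact. The function name `count_words` and all of its parameters are hypothetical, and the choice to trim stray punctuation from the ends of words is an assumption.

```python
import re
from collections import Counter


def count_words(text, extra_chars="'-", min_length=1, min_count=1,
                case_sensitive=False, stop_words=(), order="frequency"):
    """Count word frequencies in text (hypothetical names throughout)."""
    # A word is a run of letters plus any extra characters the caller allows,
    # e.g. "0123456789" for numerals or "_:" for underscores and colons.
    letter = r"[^\W\d_]"
    unit = f"(?:{letter}|[{re.escape(extra_chars)}])" if extra_chars else letter
    # Assumption: punctuation left at either end of a word is trimmed.
    tokens = (m.group().strip("'-:_") for m in re.finditer(unit + "+", text))
    if case_sensitive:
        stop = set(stop_words)
    else:
        tokens = (t.lower() for t in tokens)
        stop = {w.lower() for w in stop_words}
    counts = Counter(t for t in tokens
                     if len(t) >= min_length and t not in stop)
    # Drop words that occur less often than min_count.
    items = [(word, n) for word, n in counts.items() if n >= min_count]
    if order == "alphabetical":
        return sorted(items)
    # Most frequent first; ties are broken alphabetically.
    return sorted(items, key=lambda item: (-item[1], item[0]))


sample = "The cat's hat. The CAT sat; the well-known cat sat."
for word, n in count_words(sample, stop_words={"the"}):
    print(f"{word}\t{n}")
```

Printing tab-separated pairs, as in the loop above, gives output that a spreadsheet can typically import directly.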

Hermetic Word Frequency Counter Advanced Version

This software scans one or more docx, text or text-like files (e.g. HTML and XML files) and counts the occurrences of the different words or phrases. The words or phrases to count can be specified with regular expressions. There is no limit on the size of an input text file. The words and phrases found can be displayed alphabetically or by frequency. The program can allow or disallow words containing numerals, hyphens, apostrophes, underscores or colons, ignore words that are short or that occur infrequently, and ignore words (e.g., common words such as 'the') listed in a specified file. It can also count only the words or phrases in a specified list. Results can be written to an output file, which can be read into a spreadsheet program such as Excel. The program can also automatically create an Excel-readable file of words/phrases versus files. It can be applied to text in French, German, Italian, Spanish and other European languages.
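
As a rough illustration of a words/phrases versus files table built from regular expressions, here is a minimal Python sketch. It is not the program's implementation: the function names `count_patterns` and `to_csv`, the in-memory `texts` mapping and the CSV layout (one row per pattern, one column per file) are all assumptions made for the example.

```python
import csv
import io
import re


def count_patterns(texts, patterns, ignore_case=True):
    """Count matches of each regular expression in each text.

    texts maps a (hypothetical) file name to that file's contents.
    Returns {pattern: {file_name: count}}.
    """
    flags = re.IGNORECASE if ignore_case else 0
    table = {}
    for pattern in patterns:
        regex = re.compile(pattern, flags)
        table[pattern] = {name: sum(1 for _ in regex.finditer(text))
                          for name, text in texts.items()}
    return table


def to_csv(table, file_names):
    """Render the table as CSV text, one row per pattern."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["pattern", *file_names])
    for pattern, row in table.items():
        writer.writerow([pattern, *(row[name] for name in file_names)])
    return buffer.getvalue()


texts = {
    "a.txt": "The cat sat. Cats sit.",
    "b.txt": "No dogs here, only a cat.",
}
table = count_patterns(texts, [r"\bcats?\b", r"\bs[ai]t\b"])
print(to_csv(table, list(texts)))
```

A real tool handling input of unlimited size would presumably read each file in chunks or line by line instead of holding whole texts in memory; the sketch skips that for brevity.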

Phrase Frequency Counter Advanced

This software scans one or more docx, text or text-like files (e.g. HTML and XML files) and counts the occurrences of the different phrases. There is no limit on the size of an input text file. The phrases found can be displayed alphabetically or by frequency. The program can allow or disallow words containing numerals, hyphens, apostrophes, underscores or colons, and can ignore words (e.g., common words such as 'the') listed in a specified file. Results can be written to an output file, which can be loaded into a spreadsheet program such as Excel or LibreOffice Calc. The program works with text in French, German, Italian, Spanish and other European languages.
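
The description does not say how a phrase is delimited, so the following Python sketch makes two assumptions: a phrase is a run of `length` consecutive words, and any phrase containing an ignored word is skipped. The function name `count_phrases` and its parameters are hypothetical, and the program itself may work quite differently.

```python
import re
from collections import Counter


def count_phrases(text, length=2, stop_words=()):
    """Count phrases of `length` consecutive words (an assumed definition)."""
    # Words are runs of letters, optionally joined by apostrophes or hyphens.
    words = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text.lower())
    stop = {w.lower() for w in stop_words}
    # Slide a window over the word list. Note that in this simple sketch
    # phrases may span sentence boundaries.
    windows = (words[i:i + length] for i in range(len(words) - length + 1))
    counts = Counter(" ".join(w) for w in windows if not stop.intersection(w))
    # Most frequent phrases first.
    return counts.most_common()


sample = "New York is big. I love New York. New York!"
for phrase, n in count_phrases(sample, stop_words={"is", "i"}):
    print(f"{phrase}\t{n}")
```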

Search KWIC Concordance

This is a Windows program for generating and searching a KWIC concordance of a document ("KWIC" = "Key Word in Context"). A KWIC concordance is a list of the different words occurring in the document, with each instance of each word shown in context (that is, within a phrase). Word frequency is shown. Context size is user-definable, from 3 to 19 words. The software works on text files and MS Word docx files, skipping over "stop" words, which may be read from a file. There is no limit on the size of an input file. You can tell the program to allow or disallow hyphenated words or words with numerals, and to include only words that occur with more than a specified frequency. The concordance can be displayed alphabetically or by frequency, and can be written to a file. Once a concordance has been generated, it can be searched for specified keywords or for word patterns (such as "b?yn*w"). You can also search for a specified word (or word pattern) that has another specified word (or word pattern) close to it (that is, within the same context). The software can be used with text in languages other than English, in particular French, German, Italian, Spanish and Latin.
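
To make the idea of a KWIC search concrete, here is a minimal Python sketch, offered as an illustration and not as the program's own method. It assumes that word patterns follow glob conventions ("?" for one character, "*" for any run of characters), which Python's `fnmatch` module implements, and that the keyword sits in the middle of its context. The function name `kwic` and its parameters are hypothetical.

```python
import re
from fnmatch import fnmatchcase


def kwic(text, pattern, context=5, near=None, stop_words=()):
    """Return a context line for each word matching a glob-style pattern.

    context is the total number of words shown, keyword included.
    If near is given, keep only lines where another word in the same
    context matches that second pattern.
    """
    words = re.findall(r"[^\W\d_]+", text)
    stop = {w.lower() for w in stop_words}
    half = context // 2
    lines = []
    for i, word in enumerate(words):
        key = word.lower()
        if key in stop or not fnmatchcase(key, pattern.lower()):
            continue
        start = max(0, i - half)
        window = words[start:i + half + 1]
        if near is not None and not any(
                fnmatchcase(w.lower(), near.lower())
                for j, w in enumerate(window, start) if j != i):
            continue
        lines.append(" ".join(window))
    return lines


sample = "The quick brown fox jumps over the lazy dog. The fox sleeps."
print(kwic(sample, "f?x", context=3))
print(kwic(sample, "fox", context=3, near="sl*"))
```

The first call lists every context containing a three-letter word that starts with "f" and ends with "x"; the second keeps only those contexts that also contain a word beginning with "sl".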

Hermetic Systems Home Page