Hermetic Word Frequency Counter
What is a Word?

The term 'word' usually means a word in a natural language such as English or German, but in this software it has an extended meaning. A word is a sequence of characters bounded by spaces, so it is necessary to specify exactly which characters are admissible in words.

The following characters are not admissible in words: plus signs (+), semicolons (;), double quotes (") and left and right angle-brackets (< and >). (And, of course, a word cannot include a space.)

A word may begin or end with any alphabetic character, or with any admissible non-alphabetic character (if that character is allowed in the Set Parameters window), except for a hyphen, an apostrophe, a period or a colon. In the Advanced Version, a word also may not begin or end with a comma or a parenthesis.
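The rules so far can be sketched as a small Python predicate. This is only an illustration, not the program's actual code: the function name is_word and the advanced flag are hypothetical, the sketch assumes every non-alphabetic character has been allowed in the Set Parameters window, and it assumes that commas and parentheses are rejected entirely outside the Advanced Version.

```python
# Characters that are never admissible in a word.
FORBIDDEN = set('+;"<>')
# Characters that may not begin or end a word.
NO_EDGE = set("-'.:")
# Assumption: commas and parentheses are admissible only in the Advanced Version,
# and even there they may not begin or end a word.
ADVANCED_ONLY = set(",()")


def is_word(token: str, advanced: bool = False) -> bool:
    """Return True if token satisfies the word rules described above (hypothetical helper)."""
    if not token or any(ch.isspace() for ch in token):
        return False
    if any(ch in FORBIDDEN for ch in token):
        return False
    if not advanced and any(ch in ADVANCED_ONLY for ch in token):
        return False
    no_edge = (NO_EDGE | ADVANCED_ONLY) if advanced else NO_EDGE
    return token[0] not in no_edge and token[-1] not in no_edge
```

Under these assumptions, is_word("well-known") and is_word("don't") are accepted, while is_word("-dash"), is_word("end.") and is_word("a+b") are rejected.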

Periods and @-signs may (if allowed in the Set Parameters window) occur within a word, thus enabling the counting of email addresses. Allowing colons, forward slashes, hyphens, underscores and periods in a word enables the counting of URLs.
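The effect of these settings can be illustrated with a short Python sketch. The function count_words and its allowed parameter are hypothetical stand-ins for the choices made in the Set Parameters window. The sketch assumes that letters and digits are always admissible and that a token containing a character that has not been allowed is simply skipped; the program itself may treat such tokens differently. The rules about the first and last character of a word are omitted here for brevity.

```python
from collections import Counter


def count_words(text: str, allowed: str) -> Counter:
    """Count space-delimited tokens made up only of letters, digits,
    and the characters in `allowed` (hypothetical helper)."""
    counts = Counter()
    for token in text.split(" "):
        # Assumption: tokens with a character outside the allowed set are skipped.
        if token and all(ch.isalnum() or ch in allowed for ch in token):
            counts[token] += 1
    return counts


text = "mail info@example.com or visit http://example.com/a_b"

# Allowing periods and @-signs lets the email address count as one word.
email_counts = count_words(text, allowed=".@")

# Allowing colons, slashes, hyphens, underscores and periods admits the URL.
url_counts = count_words(text, allowed=":/-_.")
```

With only periods and @-signs allowed, the email address is counted but the URL is not; with the second set of characters the reverse holds.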

Because the Advanced Version allows commas and parentheses within words, the names of chemical compounds can be treated as words, for example: 2,5-dimethoxy-4-(N-propyl-thio)benzaldehyde. (For more details on this possibility see here.)
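As a quick check of why commas and parentheses matter for such names, the following Python lines list the non-alphanumeric characters in the compound name above. This is only an illustrative sketch; the variable names are arbitrary.

```python
name = "2,5-dimethoxy-4-(N-propyl-thio)benzaldehyde"

# Collect the distinct characters that are neither letters nor digits.
non_alnum = sorted({ch for ch in name if not ch.isalnum()})

# The name starts with a digit and ends with a letter, so its first and
# last characters are not among the restricted ones.
edges_ok = name[0].isalnum() and name[-1].isalnum()
```

The name contains hyphens, a comma and a pair of parentheses, so it can be counted as a single word only when all of these characters are admissible.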
