Non-English Text

Use of the program to count different words in non-English text

Hermetic Word Frequency Counter (and the Advanced Version) may be used with text in some (but not all) languages other than English, including German, French, Italian, Spanish and Portuguese. In fact, it works with any language whose characters can be encoded using ISO 8859-1, a subset of Windows 1252. (For more details see Scannable Files and Languages Supported.) Some European languages (such as Polish and Czech) and all non-European languages (such as Arabic and Hebrew) are not supported.
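One way to check in advance whether a piece of text falls within this limit is to try encoding it as ISO 8859-1. The following Python sketch is not part of the program; the function name fits_latin1 is made up for illustration, and it tests only the character set, not any other requirement the program may have.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text can be encoded as ISO 8859-1.

    Hypothetical helper for illustration only; it is not part of
    Hermetic Word Frequency Counter.
    """
    try:
        text.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    samples = [
        "Bewußtsein",   # German
        "coração",      # Portuguese
        "żółć",         # Polish
        "čeština",      # Czech
        "שלום",         # Hebrew
    ]
    for sample in samples:
        status = "encodable" if fits_latin1(sample) else "not encodable"
        print(f"{sample}: {status} in ISO 8859-1")
```

The German and Portuguese samples pass the check, while the Polish, Czech and Hebrew samples fail it because they contain characters that ISO 8859-1 does not include.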

This shows the result of counting words in a 1.04 MB HTML page in French (done in 9 seconds):

Hermetic Word Frequency Counter counting words in French text

Here are examples of the output when using German text (the words are ordered alphabetically) and Portuguese text (the words are also ordered alphabetically):

The option for dropping a final 's' unless it is preceded by an 's' or a vowel is intended to allow the conflation of singular and plural nouns in English (e.g., 'dog' and 'dogs'). This option also helps to conflate German nouns with their genitives, e.g., 'Bewußtsein' and 'Bewußtseins'. However, the option may have unintended consequences, because it also removes a final 's' from words that are not plurals or genitives, so it is better to leave it unchecked unless the results of a scan suggest that it should be used.
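The rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not the program's actual code: the function name drop_final_s is hypothetical, and treating only a, e, i, o and u as vowels is an assumption (the program may handle 'y' or accented vowels differently).

```python
# Assumption: only these five letters count as vowels.
VOWELS = set("aeiou")


def drop_final_s(word: str) -> str:
    """Drop a trailing 's' unless the letter before it is 's' or a vowel.

    Hypothetical illustration of the rule described in the text.
    """
    if len(word) > 1 and word[-1].lower() == "s":
        previous = word[-2].lower()
        if previous != "s" and previous not in VOWELS:
            return word[:-1]
    return word


if __name__ == "__main__":
    for word in ["dogs", "Bewußtseins", "glass", "bus", "perhaps"]:
        print(f"{word} -> {drop_final_s(word)}")
```

Under these assumptions 'dogs' becomes 'dog' and 'Bewußtseins' becomes 'Bewußtsein', while 'glass' and 'bus' are left alone. The word 'perhaps' becomes 'perhap', which is an example of the kind of unintended consequence mentioned above.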

Here are examples of the output when using Italian text (the words are ordered by frequency) and Spanish text (the words are also ordered by frequency):

