Text to HTML
 

This utility converts the text in a text file into an HTML document. This panel initially appears as follows:

[Screenshot: the Text to HTML panel]

Sometimes you may have a text file (obtained, e.g., from a mailing list) that you wish to publish on your website. A quick way of converting a text file to an HTML document, of course, is to add a wrapper of the form:


<html>
<body><pre>
The text goes here.
</pre></body></html>
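
As a rough illustration of this quick approach, the wrapper can be added with a few lines of Python. This is only a sketch: the function name wrap_as_pre is hypothetical, and escaping the characters &, < and > is an added assumption that goes slightly beyond the bare wrapper shown above.

```python
import html


def wrap_as_pre(text: str) -> str:
    """Wrap plain text in a minimal <html><body><pre> page."""
    # Assumption: escape &, < and > so the text is not read as markup.
    escaped = html.escape(text, quote=False)
    return "<html>\n<body><pre>\n" + escaped + "\n</pre></body></html>\n"


if __name__ == "__main__":
    print(wrap_as_pre("The text goes here."))
```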

Nevertheless there are several problems remaining:

The Text to HTML tool of Easy HTML Construction Kit automates most of the tasks involved in converting a text file to a decent-looking HTML document. Some editing may still be required to get the HTML document into its final form, but the tool should do most of the work (five examples are given below).


Features of Text to HTML

  1. Text to HTML automatically converts the following:

    From: < > " & ¢ £ ¥ € © « » ¿
    To: &lt; &gt; &quot; &amp; &cent; &pound; &yen; &euro; &copy; &laquo; &raquo; &iquest;
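
A minimal sketch of this kind of substitution in Python follows. It is not the program's actual code: the names ENTITIES and escape_special are hypothetical, and the euro sign is included because &euro appears in the To list.

```python
# Mapping based on the From/To lists above (names here are illustrative).
ENTITIES = {
    "<": "&lt;", ">": "&gt;", '"': "&quot;", "&": "&amp;",
    "¢": "&cent;", "£": "&pound;", "¥": "&yen;", "€": "&euro;",
    "©": "&copy;", "«": "&laquo;", "»": "&raquo;", "¿": "&iquest;",
}


def escape_special(text: str) -> str:
    """Replace each listed character with its HTML entity."""
    # Characters are handled one at a time, so the "&" of an entity
    # that has just been written is never escaped a second time.
    return "".join(ENTITIES.get(ch, ch) for ch in text)
```

Working character by character avoids the classic pitfall of chained string replacements, where an already written &lt; could turn into &amp;lt;.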


  2. Language-specific characters (those in the table below) are converted to the corresponding Latin-1 (ISO-8859-1) entities: &Agrave;, &Aacute; and so on.

    À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß  
    à á â ã ä å æ ç è é ê ë ì í î ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ  
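
The same idea can be sketched in Python using the entity names that ship with the standard library. This is an approximation under stated assumptions: the function name latin1_to_entities is hypothetical, and the code converts every named character in the range 0xC0 to 0xFF, which is slightly wider than the table above.

```python
from html.entities import codepoint2name


def latin1_to_entities(text: str) -> str:
    """Replace Latin-1 letters in the range 0xC0-0xFF with named entities."""
    out = []
    for ch in text:
        cp = ord(ch)
        # Assumption: the whole 0xC0-0xFF block is converted, which also
        # covers a few characters not shown in the table above.
        if 0xC0 <= cp <= 0xFF and cp in codepoint2name:
            out.append("&" + codepoint2name[cp] + ";")
        else:
            out.append(ch)
    return "".join(out)
```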


  3. Text to HTML automatically converts http:// and ftp:// references to active links. For example, if a document contains

    http://tcho.usno.gov:80/~cal/gps_week.htm

    then the HTML document will contain

    <a href="http://tcho.usno.gov:80/~cal/gps_week.htm">
    http://tcho.usno.gov:80/~cal/gps_week.htm</a>

    and it will appear as

    http://tcho.usno.gov:80/~cal/gps_week.htm

    URLs to be converted in this way may contain question marks, commas and semicolons, e.g.:

    http://disc.serve.com/discussion.cgi?disc=149495;article=45885,title=NFPA
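
One way to sketch this conversion is with a regular expression, as below. The pattern and the name link_urls are assumptions made for illustration, not the tool's own rules; in particular, treating trailing punctuation as part of the sentence rather than the URL is a design choice of this sketch.

```python
import re

# Assumption: a URL runs until whitespace, a quote or an angle bracket.
URL_RE = re.compile(r'(?:http|ftp)://[^\s<>"]+')


def link_urls(text: str) -> str:
    """Turn http:// and ftp:// references into <a href> links."""
    def repl(match: re.Match) -> str:
        url = match.group(0)
        # Punctuation at the very end usually belongs to the sentence.
        stripped = url.rstrip(".,;:!?")
        tail = url[len(stripped):]
        return f'<a href="{stripped}">{stripped}</a>{tail}'

    return URL_RE.sub(repl, text)
```

Question marks, commas and semicolons inside the URL are kept, because only punctuation at the very end is trimmed.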


  4. Text to HTML automatically converts email addresses to active links. For example, if a document contains

    joe@eazenet.com

    then the HTML document will contain

    <a href="mailto:joe@eazenet.com">joe@eazenet.com</a>

    and it will appear as

    joe@eazenet.com
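
A comparable sketch for email addresses is shown below. The pattern is a deliberately simple assumption (real address syntax is broader), and the name link_emails is hypothetical.

```python
import re

# Assumption: a simplified address pattern, local-part@domain.tld
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def link_emails(text: str) -> str:
    """Turn email addresses into mailto: links."""
    return EMAIL_RE.sub(
        lambda m: f'<a href="mailto:{m.group(0)}">{m.group(0)}</a>', text
    )
```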


  5. If you want some lines in the text to be centered headings, i.e., placed in <hn align=center>...</hn> tags, then insert \h1, \h2, ..., \h6 at the start of the line. For example, \h3This is a Heading will be converted to

    <h3 align="center">This is a Heading</h3>

    and will appear as

    This is a Heading
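
The heading markers could be handled along the following lines. This is a sketch only; convert_heading is a hypothetical name, and the line is assumed to arrive without its trailing newline.

```python
import re

# A backslash, the letter h, a level from 1 to 6, then the heading text.
HEADING_RE = re.compile(r"^\\h([1-6])(.*)$")


def convert_heading(line: str) -> str:
    """Convert a line starting with \\h1 ... \\h6 into a centered heading."""
    m = HEADING_RE.match(line)
    if m is None:
        return line  # not a heading line, leave it unchanged
    level, title = m.group(1), m.group(2).strip()
    return f'<h{level} align="center">{title}</h{level}>'
```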


  6. Text to HTML automatically strips out unwanted email header fields such as X-sender:. A line is treated as a field if its first word ends with a colon. A field is stripped out if the field name contains a dash (e.g., Return-to-address:). This leaves fields such as Date:, From:, To:, Subject:, etc. Fields which are included in the output file always begin on a new line.
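
The filtering rule described here can be sketched as follows. The names keep_line and strip_headers are hypothetical, and the sketch applies the rule to every line it is given, whereas the tool presumably restricts it to the header area of a message.

```python
def keep_line(line: str) -> bool:
    """Return False for header fields whose name contains a dash."""
    words = line.split()
    if not words or not words[0].endswith(":"):
        return True  # first word has no trailing colon: not a field
    field_name = words[0][:-1]
    return "-" not in field_name


def strip_headers(lines: list[str]) -> list[str]:
    """Drop unwanted header fields, keep everything else in order."""
    return [ln for ln in lines if keep_line(ln)]
```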


  7. If text is quoted within the text file using ">", as is usual in email messages, then each quoted line will appear in the HTML document on a new line, to preserve the original appearance of the quoted text.


  8. If a line ends with a space plus an 'equals' sign, '=', the terminal '=' is removed. (Some mailer programs add '=' whenever a line break is inserted in uploaded text.)
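
In code, this rule amounts to a single check per line. The sketch below uses the hypothetical name strip_trailing_equals and keeps the space that preceded the '=' sign, which is an assumption about the exact output.

```python
def strip_trailing_equals(line: str) -> str:
    """Remove a terminal '=' when it is preceded by a space."""
    if line.endswith(" ="):
        return line[:-1]  # drop only the '=' itself
    return line
```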


  9. In order to preserve formatting and spacing Text to HTML supports a little "pre-HTML":

    • Any occurrence of <BR> or <br> in the text file will be kept, and thus cause a line break in the displayed text.

    • Any occurrence of <P> or <p> in the text file will be kept, and thus cause a paragraph break (with a blank line) in the displayed text.

    • Any occurrence of <HR> or <hr> in the text file will be retained (as <HR>).

    • Any occurrence of <PRE> or </PRE> (or of <pre> or </pre>) in the text file will be kept, thus preserving the formatting, including the line breaks, present in the text file between <pre> and </pre>.

    When using Text to HTML on mailing list messages it is advisable to enclose signature lines within <pre> ... </pre> to preserve line breaks.

    The headers of an email message should not be enclosed within <pre> ... </pre>, since Text to HTML is designed to deal with headers automatically.


  10. Outside of a <pre> ... </pre> section, if a line ends with a hyphen, '-', the terminal '-' is removed and no line break is written to the file, with the intention that no word break occurs in the displayed text.
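
A rough sketch of this joining step is given below. The name join_hyphenated is hypothetical, the <pre> tracking is simplified to whole lines, and lines made up only of dashes are left alone on the assumption that they are not hyphenated words.

```python
def join_hyphenated(lines: list[str]) -> list[str]:
    """Join a line ending in '-' with the next line, outside <pre> sections."""
    out: list[str] = []
    carry = ""
    in_pre = False
    for line in lines:
        lower = line.lower()
        if "<pre>" in lower:
            in_pre = True
        # Assumption: a line consisting only of dashes is not a word break.
        has_text = line.strip("-") != ""
        if not in_pre and has_text and line.endswith("-"):
            carry += line[:-1]  # drop the hyphen, write no line break
        else:
            out.append(carry + line)
            carry = ""
        if "</pre>" in lower:
            in_pre = False
    if carry:
        out.append(carry)
    return out
```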


  11. When preparing a text file to be converted to HTML you can divide sections of text (outside of <pre>...</pre>) by using a line of at least 10 dashes. You can, optionally, specify the width of the dividing lines by appending a percentage, as in:

    ----------------20%
    ----------30%
    ------------40%
    and so on.
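
Recognizing such divider lines might look like the sketch below. The source does not state which HTML the tool writes for a divider, so the <hr> output here is an assumption, as is the name convert_divider.

```python
import re

# At least 10 dashes, optionally followed by a percentage such as 20%.
DIVIDER_RE = re.compile(r"^-{10,}(?:(\d{1,3})%)?$")


def convert_divider(line: str) -> str:
    """Convert a line of 10 or more dashes into a horizontal rule."""
    m = DIVIDER_RE.match(line.strip())
    if m is None:
        return line  # not a divider, leave it unchanged
    if m.group(1):
        # Assumption: the percentage becomes the width of the rule.
        return f'<hr width="{m.group(1)}%">'
    return "<hr>"
```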


Example of Use

The file containing the text to be converted to HTML must be specified by means of a full pathname. This can be entered manually into the first text box or it can be created using the dialog box produced by clicking on Specify input file. Then the output file (the HTML document) must be specified (in the same way).

Custom header and footer files and text and background colors may be specified as explained on the HTML Template page.

In this example we need to add one <pre>...</pre> pair around the signature to preserve its line breaks (producing the file EX1T.TXT):

<pre>--------------------------------------------------------------------------
This list is public. To join fight-censorship-announce, send
"subscribe fight-censorship-announce" to majordomo@vorlon.mit.edu.
More information is at http://www.eff.org/~declan/fc/</pre>

Having specified text and background colors, the input setup for this example is as follows:

[Screenshot: input setup for this example]

The result of clicking on [button image] is then EX1T.HTM.


Further Examples

In the following examples (mostly from mailing lists) the file in the first column is the original text file; the file in the second column is the result of using Text to HTML on this; the file in the third column is the original file with a little pre-HTML added; the file in the fourth column is the result of using Text to HTML on this.

Original text   Original text using   Original with   Original with pre-HTML
                Text to HTML          pre-HTML        using Text to HTML
EX2.TXT         EX2.HTM               EX2T.TXT        EX2T.HTM
EX3.TXT         EX3.HTM               EX3T.TXT        EX3T.HTM
EX4.TXT         EX4.HTM               EX4T.TXT        EX4T.HTM
EX5.TXT         EX5.HTM               EX5T.TXT        EX5T.HTM

The HTML source code in these examples has tags in upper case. The software was changed in Version 8.01 to generate HTML tags in lower case.


Files larger than 40 KB

When converting a text file to an HTML document there is no limit on the input file size, but the program itself can display input and output files only if they are not larger than 40 KB. If an input file is larger than this, clicking on View input file opens it automatically in whatever program is associated with the .txt extension.

If you convert, e.g., a 1037 KB text file Lunecl.txt to the HTML document Lunecl.htm then first the usual How to view? window will appear:

If you click on As HTML code the HTML file will not be displayed in a window (as is done for output files not larger than 40 KB) but rather a message box will appear stating that the file is too large:

If instead you click on In web browser the HTML file will be displayed in your web browser regardless of the size of the file (assuming there is sufficient memory available).
