How to create an HTML document using txt2html

D.R. Lorimer

How txt2html works

txt2html is a Tcl script which converts simple ASCII text files into HTML documents. The name of the output HTML file is based on the stem of the ASCII input file. For example,

txt2html index.txt

would produce an HTML file called index.html.
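
The naming rule can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the Tcl code that txt2html uses; the function name output_name is hypothetical.

```python
from pathlib import Path


def output_name(input_name: str) -> str:
    """Return the HTML file name built from the stem of the input file name."""
    # with_suffix swaps the final extension (.txt here) for .html
    return str(Path(input_name).with_suffix(".html"))
```

Under these assumptions, output_name("index.txt") gives index.html.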

Modelled on LaTeX, txt2html lets you break a document into sections and produce lists. Image files in GIF and JPEG formats can easily be incorporated as figures. To produce the HTML output, txt2html parses your ASCII text file looking for control statements which define sections, lists, etc. A control statement is a line that starts with the % character followed by a keyword. For example,

%section How txt2html works

was used to define the start of this section. The full range of keywords is described in the following sections. Note that the % character is ONLY recognized by the parser if it is in column 1 of a line!
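
The column-1 rule can be illustrated with a small Python sketch. It is not the parser from the Tcl script; the function name parse_control is hypothetical, and it assumes that the keyword is the first word after the % and that the rest of the line is its argument.

```python
def parse_control(line: str):
    """Split a control line into (keyword, argument); return None for ordinary text.

    Assumption: the keyword is the first word after the %, and everything
    after it on the same line is the argument.
    """
    if not line.startswith("%"):
        return None  # the % must sit in column 1 to be recognized
    parts = line[1:].rstrip("\n").split(None, 1)
    if not parts:
        return None  # a lone % carries no keyword
    keyword = parts[0]
    argument = parts[1].strip() if len(parts) > 1 else ""
    return keyword, argument
```

With this sketch, the line "%section How txt2html works" yields the pair ("section", "How txt2html works"), while the same text indented by even one space is treated as ordinary text.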

Setting the title and author's name

This is done using the %title and %author keywords. For example,

%title How to create an HTML document
%author D.R. Lorimer

was used at the top of this file.

Creating sections

As in LaTeX, sections and subsections can be defined. txt2html then uses these definitions to link each section automatically to the table of contents. Three levels are possible:

%section Name of a section
%subsection Name of a subsection
%subsubsection Name of a subsubsection
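
How section keywords could feed a linked table of contents is sketched below in Python. The heading tags (h2 to h4), the anchor names (sec1, sec2, ...) and the function name build_toc are all assumptions made for illustration; the HTML that txt2html actually emits is not documented here.

```python
import html

# Assumed mapping from keyword to HTML heading level
LEVELS = {"section": 2, "subsection": 3, "subsubsection": 4}


def build_toc(lines):
    """Return (toc_entries, body_lines) for a list of input lines."""
    toc, body = [], []
    count = 0
    for line in lines:
        keyword, _, title = line.partition(" ")
        level = LEVELS.get(keyword[1:]) if keyword.startswith("%") else None
        if level is None:
            body.append(line)  # not a section keyword: pass through unchanged
            continue
        count += 1
        anchor = f"sec{count}"
        text = html.escape(title.strip())
        # Each contents entry links to the anchor placed on the heading
        toc.append(f'<li class="level{level - 1}"><a href="#{anchor}">{text}</a></li>')
        body.append(f'<h{level} id="{anchor}">{text}</h{level}>')
    return toc, body
```

The contents entries and the headings share the same anchor names, which is all that is needed for the links to work.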

Creating lists

The item keyword is used to make lists. For example the following shopping list:

  • 3 eggs
  • pint of milk
  • 1/2 pound of cheese

was created as follows:

%item 3 eggs
%item pint of milk
%item 1/2 pound of cheese
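
One way a converter might turn consecutive %item lines into an HTML list is sketched here in Python. The choice of an unordered list (ul/li) and the function name render_items are assumptions; the sketch is not taken from the txt2html source.

```python
import html


def render_items(lines):
    """Turn each run of consecutive %item lines into one <ul> list."""
    out, in_list = [], False
    for line in lines:
        if line.startswith("%item "):
            if not in_list:
                out.append("<ul>")  # first item of a run opens the list
                in_list = True
            out.append(f"<li>{html.escape(line[len('%item '):].strip())}</li>")
        else:
            if in_list:
                out.append("</ul>")  # any other line closes the list
                in_list = False
            out.append(line)
    if in_list:
        out.append("</ul>")  # close a list that runs to the end of the input
    return out
```

Feeding it three %item lines for the shopping list produces one ul element holding three li entries.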

Placing images

Images in GIF or JPEG format can be placed into the text using the image keyword. The syntax for this keyword is:

%image filename (alignment) (width) (height)

where filename is the name of the image file. The optional arguments, shown in parentheses, control the alignment and size of the image. Alignment is one of left, right, centre or center; width and height are the image dimensions in pixels. As an example, the following image:

[image: mouse.gif, centred]

was placed using the syntax:

%image mouse.gif centre
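
The argument handling for the image keyword can be sketched in Python as follows. The generated markup (an img tag wrapped in a div for alignment) and the function name render_image are assumptions for illustration, not the output txt2html is known to produce.

```python
import html


def render_image(args: str) -> str:
    """Build an <img> tag from 'filename [alignment] [width] [height]'."""
    parts = args.split()
    if not parts:
        raise ValueError("%image needs a file name")
    filename, rest = parts[0], parts[1:]
    tag = f'<img src="{html.escape(filename, quote=True)}"'
    if len(rest) > 1:
        tag += f' width="{int(rest[1])}"'  # dimensions are whole pixels
    if len(rest) > 2:
        tag += f' height="{int(rest[2])}"'
    tag += ">"
    align = rest[0].lower() if rest else ""
    if align in ("centre", "center"):
        align = "center"  # both spellings are accepted
    elif align not in ("", "left", "right"):
        raise ValueError(f"unknown alignment: {align}")
    # Assumed markup: alignment is applied by a wrapping <div>
    return f'<div style="text-align: {align}">{tag}</div>' if align else tag
```

Both spellings of the central alignment map to the same output, so "%image mouse.gif centre" and "%image mouse.gif center" would behave identically in this sketch.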

Adding links

Links to other URLs may be added using the link keyword, which has the following syntax:

%link url text inside link

So, for example, this link to the Jodrell home page was created using the line:

%link http://www.jb.man.ac.uk this link to the Jodrell home page
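
The link syntax splits naturally at the first space: the first word is the URL and the remainder is the link text. A minimal Python sketch, with the hypothetical function name render_link and the assumption that a link without text falls back to showing the URL:

```python
import html


def render_link(args: str) -> str:
    """Build an <a> tag from 'url text inside link'."""
    url, _, text = args.strip().partition(" ")
    if not url:
        raise ValueError("%link needs a URL")
    label = text.strip() or url  # assumption: show the URL when no text is given
    return f'<a href="{html.escape(url, quote=True)}">{html.escape(label)}</a>'
```

Applied to the arguments of the example line, this gives an anchor whose href is http://www.jb.man.ac.uk and whose text is "this link to the Jodrell home page".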

New paragraphs and line breaks

A new paragraph is signified by a blank line like this...

Forcing a line break like this
can be done using the newline keyword.

Tables

Simple tables can be produced using the table keyword to mark the start and end of the table, and the row keyword to define each row of entries, with the | character separating the columns. For example, the table:

          Average height   Average weight   Red eyes
Males     1.9              0.003            0.4
Females   1.7              0.002            0.2

was produced as follows:

%table
%row | Average height | Average weight | Red eyes
%row Males | 1.9 | 0.003 | 0.4
%row Females | 1.7 | 0.002 | 0.2
%table
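
Because the same keyword both opens and closes a table, a converter has to remember whether it is currently inside one. The Python sketch below shows that toggle together with the splitting of rows at the | character. The function name render_table and the plain table/tr/td markup are assumptions, not the documented output of txt2html.

```python
import html


def render_table(lines):
    """Convert %table ... %row ... %table lines into HTML table markup."""
    out, in_table = [], False
    for line in lines:
        line = line.rstrip()
        keyword = line.split(None, 1)[0] if line else ""
        if keyword == "%table":
            # The same keyword opens and closes the table, so toggle the state
            out.append("</table>" if in_table else "<table>")
            in_table = not in_table
        elif keyword == "%row" and in_table:
            cells = [cell.strip() for cell in line[len("%row"):].split("|")]
            row = "".join(f"<td>{html.escape(cell)}</td>" for cell in cells)
            out.append(f"<tr>{row}</tr>")
    return out
```

A row that begins with | gets an empty first cell, which is how the blank top-left corner of the example table arises.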

Mathematical Equations

There are two features in txt2html which can produce mathematical symbols and equations. The first is a displayed equation, written in standard LaTeX math mode and surrounded by %equation delimiters. For example, the following equation

E = m c^2

was produced simply as follows:

%equation
E = m c^2
%equation

As txt2html parses your file, it extracts the LaTeX source and generates a GIF file from it. This is then automatically placed as an image in your resulting HTML file.
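
The extraction step can be sketched in Python as follows. Both function names are hypothetical, and the wrapper document (article class, empty page style, display math) is only an assumption about what a LaTeX run for a single equation might look like; the conversion of the LaTeX output to a GIF is not shown.

```python
def extract_equations(lines):
    """Collect the LaTeX source found between pairs of %equation lines."""
    equations, current = [], None
    for line in lines:
        if line.rstrip() == "%equation":
            if current is None:
                current = []  # opening delimiter
            else:
                equations.append("\n".join(current))
                current = None  # closing delimiter
        elif current is not None:
            current.append(line.rstrip())
    return equations


def latex_document(equation: str) -> str:
    """Wrap one equation in a minimal LaTeX document (assumed layout)."""
    return "\n".join([
        r"\documentclass{article}",
        r"\pagestyle{empty}",  # no page number on the rendered page
        r"\begin{document}",
        r"\[",
        equation,
        r"\]",
        r"\end{document}",
    ])
```

Text outside the delimiters is ignored by extract_equations, so the function can be run over a whole input file.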

Producing expressions within text is also possible using the %math command. For example, the following sentence:

Einstein's famous equation E = m c^2 relates matter to energy.

was produced as follows:

Einstein's famous equation
%math E = m c^2 %math
relates matter to energy.

Note that the alignment is currently not quite right. Any suggestions as to how to fix this are most welcome!

Including LaTeX definitions in txt2html

Often, when using LaTeX, it is desirable to have a separate file with shorthand definitions that are used in formulae. Such definitions can be made available to txt2html by putting them in a file and giving the name of that file on the command line. For example,

txt2html file.txt extras.tex

will include extras.tex whenever LaTeX is run to produce a mathematical expression.

Omitting LaTeX

Once all your equations are in place and txt2html has been run, there is no need to keep regenerating the GIF files if you just want to edit some other part of the text. Since generating them is time consuming, txt2html has a command-line option which suppresses the LaTeX step. For instance, now that I have produced the expressions for the above discussion, I need only do

txt2html index.txt nolatex

to regenerate this document.
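
The two command-line forms shown in this document can be handled by logic along the following lines. This Python sketch uses the hypothetical function name parse_args and assumes that the definitions file and nolatex may be given together in either order, which the examples here do not confirm.

```python
def parse_args(argv):
    """Return (input_file, definitions_file, run_latex) from the arguments.

    argv holds the arguments after the script name. Assumption: a
    definitions file and 'nolatex' may appear together, in either order.
    """
    if not argv:
        raise ValueError("usage: txt2html file.txt [definitions.tex] [nolatex]")
    input_file, definitions, run_latex = argv[0], None, True
    for arg in argv[1:]:
        if arg == "nolatex":
            run_latex = False  # skip the LaTeX step and reuse existing images
        else:
            definitions = arg  # any other argument names the definitions file
    return input_file, definitions, run_latex
```

With these assumptions, the arguments index.txt nolatex give ("index.txt", None, False), and file.txt extras.tex give ("file.txt", "extras.tex", True).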

Closing Remarks

txt2html is currently a rather simple script, fewer than 170 lines long at present. It could easily be expanded to include other HTML features, such as a frame-based system for displaying the table of contents alongside the text rather than at the top. Please let me know any comments or suggestions for such additions and I'll try to incorporate them.