

BASICS





All the standard rules of software production apply to writing HTML pages. In particular, remember
that the sooner you begin to code, the longer it will take, and that while getting something to run today is easy,
figuring out what it does two months down the road is rather difficult.
The design of HTML, which creates all the problems of modular construction without
conferring any of the benefits, doesn't help. The basic problem is that there is no way to make
symbolic definitions of macros, procedures, variables, and the like. As a result, dozens of files may have to
be updated whenever you make any changes.
The programs described here are all management tools for HTML files. They
have all been implemented as Kornshell scripts, and have been tested and
run using the MKS UNIX toolkit -- a set of UNIX tools implemented under DOS.
The code is slightly roundabout, in an attempt to avoid any cleverness whatsoever, and to work
under an OS that might crash at any moment.
They should (famous last words) run under any standard UNIX implementation.
Problems they help solve include:
- Keeping the content of features (like the jump bar at the top and bottom of each page) consistent
across all the pages.
- Checking that local file references actually exist, and making it easy to deal with the ones that don't.
- Producing a list of remote links for testing.
- Producing a complete list of local files actually used by the .html pages (rather than all the outdated,
unused, and backup files that happen to be in the same directories).
- Producing a complete list of actually-used files modified since the last site archive was created (so
that you don't have to upload all those zips and gifs every time).
My web pages are developed on a standalone PC using HotDog and Netscape 1.22. All relevant files are collected
(as described below), then uploaded to a Web archive.
As far as possible, this software has NO FEATURES! Please make sure that you understand how it
works before you use it.


NAMING CONVENTIONS





A few simple conventions make maintenance much easier. First, all files are in subdirectories at the same
relative level:
              www               ... subdirectories only -- no files
         /     |    \
    main     font     gif       ... other directories
      |        |       |
  main.htm font.htm foo.gif     ... other files
The top-level directory, www, contains only subdirectories. Eventually, when it is installed, it will contain
a "main.html" file that is linked to main/main.htm.
In the HTML files, every local reference is given as a full path name relative to the parent directory:
- main.htm refers to a file in ../gif as ../gif/file.gif, as expected.
- main.htm refers to another file in its own directory as ../main/file.htm, not as file.htm.
Why do things this way? So that every local file reference has the exact same form, no matter
where it occurs. This makes it far, far easier to maintain the whole suite of pages.
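As a rough illustration of the convention, the check below accepts only references of the form ../directory/file. It is a minimal sketch, not one of the scripts described on this page; the function name is_canonical_ref is hypothetical.

```shell
# Hypothetical helper (not one of the author's scripts): succeed only if a
# local reference has the form ../directory/file, with exactly one directory.
is_canonical_ref() {
    case "$1" in
        ../*/*/*) return 1 ;;  # more than one directory level
        ../*/)    return 1 ;;  # directory with no file name
        ..//*)    return 1 ;;  # empty directory name
        ../*/*)   return 0 ;;  # ../directory/file
        *)        return 1 ;;  # bare file name or any other form
    esac
}

# Example: report each sample reference that breaks the convention.
for ref in ../gif/file.gif ../main/file.htm file.htm gif/file.gif; do
    is_canonical_ref "$ref" || echo "not canonical: $ref"
done
```

Run as written, the loop reports file.htm and gif/file.gif, since neither starts from the parent directory.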


FEATURE NAMING





Whenever possible, features that are repeated from page to page are named. If they are ever modified,
two tools distribute the new versions across the suite. The tools are:
- update.ksh collects the features from a 'model' file (usually main/main.htm), and creates
a revision file for each feature.
- revise.ksh performs various safety checks, then inserts the revisions, as appropriate, into each HTML file in www/*/*.htm.
Features look like this:
<!--feature pageback version 26861-->
<BODY background="../gif/edgepr1.gif" bgcolor="ffffff">
<!--endfeature pageback -->
The number (26861) is inserted by the revision program itself.
Other named features include the page masthead and the list of links at the top and bottom of each page.
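To show how the marker comments make a feature easy to pull out of a page, here is a minimal sketch that prints one named feature. It is not the author's update.ksh; the function name extract_feature and the output file name in the usage comment are hypothetical, and the sketch assumes feature names contain only letters and digits.

```shell
# Hypothetical helper (not the author's update.ksh): print one named feature,
# from its opening comment through its closing comment.
# Assumes the feature name has no characters special to regular expressions.
extract_feature() {
    feature=$1
    page=$2
    # Print every line between the two marker comments, markers included.
    sed -n "/<!--feature $feature /,/<!--endfeature $feature /p" "$page"
}

# Example with placeholder file names: save the 'pageback' feature of a
# model page so it can be copied into other pages later.
# extract_feature pageback ../main/main.htm > pageback.rev
```

Because both markers carry the feature name, the same range expression works for any named feature without knowing what lies between the markers.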


CHECKING AND ARCHIVING





The biggest headache with a large set of pages is making sure that all the links are actually there.
These programs both search all ../*/*.htm files (i.e., every .htm file at the same level):
- gethttp.ksh collects external references of the form "http:", "mailto:", "gopher:", and "ftp:"
from all the HTML files, then stores each reference, along with the file it appears in, in a new
file called httplist.htm, where you can check the references at leisure.
- getref.ksh looks for local file references. It checks whether
the files actually exist, then creates:
  - ziplist.all -- a complete list of files that do exist,
  - ziplist.not -- files that do not exist,
  - ziplist.htm -- all files, together with the pages that refer to files that do not exist,
  - ziplist.new -- files that do exist AND have been modified since the
    creation date of a file named "htm.zip" (which should be in the same directory).
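The collection of remote links can be sketched roughly as follows. This is not the author's gethttp.ksh: the function name list_remote_refs is hypothetical, the output is plain text rather than an HTML page, and the sketch assumes every reference is written as a double-quoted attribute value.

```shell
# Hypothetical helper (not the author's gethttp.ksh): print "page: reference"
# for every remote reference found in the given pages.
# Assumes references appear as double-quoted attribute values.
list_remote_refs() {
    for page in "$@"; do
        # Put each double-quoted string on its own line, keep the ones that
        # start with a remote scheme, and label each with its page.
        tr '"' '\n' < "$page" |
            grep -i -E '^(http|mailto|gopher|ftp):' |
            while IFS= read -r ref; do
                printf '%s: %s\n' "$page" "$ref"
            done
    done
}

# Example, run from inside any subdirectory of www:
# list_remote_refs ../*/*.htm > remote.txt
```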
These programs solve a variety of consistency and transport problems. httplist.htm and ziplist.htm
make it easy to do final checks on referenced files. File ZIPLIST.NOT usually contains
misspelled references, place-holders you forgot to get rid of, or path names you forgot to change.
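The existence check behind these lists could be approximated along the following lines. This is a sketch under stated assumptions, not the author's getref.ksh: the function name check_local_refs and the list names in the usage comment are hypothetical, references are assumed to be double-quoted and to follow the ../directory/file convention, and the function must be run from inside a subdirectory of www so the relative paths resolve.

```shell
# Hypothetical helper (not the author's getref.ksh): sort the ../dir/file
# references found in the given pages into a list of files that exist and
# a list of files that do not. Run it from inside a subdirectory of www.
check_local_refs() {
    found_list=$1
    missing_list=$2
    shift 2
    : > "$found_list"
    : > "$missing_list"
    for page in "$@"; do
        # One quoted string per line; keep those shaped like ../dir/file.
        tr '"' '\n' < "$page" | grep '^\.\./[^/][^/]*/[^/]' |
            while IFS= read -r ref; do
                ref=${ref%%#*}            # drop any #fragment
                if [ -f "$ref" ]; then
                    printf '%s\n' "$ref" >> "$found_list"
                else
                    printf '%s\n' "$ref" >> "$missing_list"
                fi
            done
    done
    # Remove duplicates so each file is listed once.
    sort -u -o "$found_list" "$found_list"
    sort -u -o "$missing_list" "$missing_list"
}

# Example with placeholder list names:
# check_local_refs found.txt missing.txt ../*/*.htm
```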
ZIPLIST.ALL should be used the first time you archive the site: pkzip -p htm.zip @ziplist.all. This creates a
zipfile, complete with path names, of every file in the site.
Thereafter, ZIPLIST.NEW should be used as the final argument. It includes only files that have been modified
since the creation date of HTM.ZIP. (This is slightly clunky, but was easiest to do with MKS Kornshell.)
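One way to pick out the files changed since the archive was made is to compare each file against the archive with find. The sketch below is not the author's code: the function name list_newer_files is hypothetical, and it compares modification times, which is an assumption about what "creation date" means here.

```shell
# Hypothetical helper (not the author's getref.ksh): read file names, one per
# line, from a list and print those modified more recently than the archive.
# Assumption: "newer" means a later modification time than the archive's.
list_newer_files() {
    archive=$1
    list=$2
    while IFS= read -r file; do
        # find prints the name only if the file is newer than the archive.
        if [ -n "$(find "$file" -newer "$archive" 2>/dev/null)" ]; then
            printf '%s\n' "$file"
        fi
    done < "$list"
}

# Example with the file names used on this page:
# list_newer_files htm.zip ziplist.all > ziplist.new
```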
When you unpack the zip file, remember that "pkunzip -d htm.zip" will
preserve path names and create directories as needed.
All original work © 1995 Doug Cooper. Please see this
disclaimer, which takes responsibility for content, and the
release notice, which gives you the right to copy it.
We believe that all files referenced by these pages may be distributed for research / educational purposes.
If any file should not be distributed, please let us know and we will remove it.