CheckHtmlEsis is a Java program that reads the output of James Clark’s nsgmls program and checks whether the attributes comply with the HTML 4.0 specification. It can also check whether the http URIs referenced by the document exist.
CheckHtmlEsis 1.0.2 is available to download in zip file format.
To use CheckHtmlEsis, you need to have JDK 1.1 and nsgmls installed and running on your system. CheckHtmlEsis may work with JDK 1.0 but I have not tested it. CheckHtmlEsis is run from the command line.
Note to Mac users: I have no idea how this would run on a Mac. Feedback is welcome.
CheckHtmlEsis reads the output of nsgmls from Standard Input. I usually run it with a command line such as:
Error messages are written to Standard Error. If the -t option is given, the document text is written to Standard Out.
How you run CheckHtmlEsis will vary based on where you have your HTML declaration, where you have Java installed, where you install CheckHtmlEsis, and what OS you are running. Consult your local documentation for nsgmls, Java, and your OS for more information. You may find it useful to wrap this program in a batch file or shell script.
nsgmls should be run with the -l option so that CheckHtmlEsis can get access to the line numbers of the original document. This will help you locate your errors in your document.
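To illustrate why the -l option matters: with -l, nsgmls adds `L` records to its output that carry the current line number (optionally followed by a file name), and a reader can remember the latest one to report where an element starts. The following is only a minimal sketch under that assumption, written for a current JDK rather than JDK 1.1. It is not taken from the CheckHtmlEsis source, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of the CheckHtmlEsis source.
class EsisLineTracker {
    // Returns one "line N: <GI>" entry for every start-element record.
    static List<String> locateStartTags(String esisOutput) {
        List<String> found = new ArrayList<>();
        String currentLine = "?"; // unknown until the first L record is seen
        for (String record : esisOutput.split("\n")) {
            if (record.isEmpty()) {
                continue;
            }
            char command = record.charAt(0);
            String rest = record.substring(1);
            if (command == 'L') {
                // Either "Llineno" or "Llineno file"; keep only the number.
                int space = rest.indexOf(' ');
                currentLine = space < 0 ? rest : rest.substring(0, space);
            } else if (command == '(') {
                // Start of an element; rest is the element name.
                found.add("line " + currentLine + ": <" + rest + ">");
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String sample = "L3\n(P\n-Hello\n)P\n";
        for (String entry : locateStartTags(sample)) {
            System.out.println(entry);
        }
    }
}
```

Without -l there are no `L` records, so a checker like this would have nothing better than "?" to report.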
With the -t option CheckHtmlEsis will output the text in the document. This can be used with programs like ispell. For example:
CheckHtmlEsis takes a few command-line options.
When CheckHtmlEsis dumps text to Standard Out, element data is split up at the ends of the elements. For example, M<SUP>lle</SUP> will be output as:
M lle
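The effect can be reproduced with a small sketch that treats every ESIS data record (a line starting with `-`) as its own chunk of text and joins the chunks with spaces. This is an illustration of the symptom, not the actual CheckHtmlEsis code: the class name is hypothetical, and escape sequences inside data records are deliberately left undecoded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of the CheckHtmlEsis source.
class EsisTextDump {
    // Collects the data records and joins them with single spaces.
    static String dumpText(String esisOutput) {
        List<String> chunks = new ArrayList<>();
        for (String record : esisOutput.split("\n")) {
            if (record.startsWith("-")) {
                // Drop the leading '-' command character; escapes are not decoded.
                chunks.add(record.substring(1));
            }
        }
        return String.join(" ", chunks);
    }

    public static void main(String[] args) {
        // Records for M<SUP>lle</SUP>
        System.out.println(dumpText("-M\n(SUP\n-lle\n)SUP\n")); // prints "M lle"
    }
}
```

Because the element boundary separates the two data records, a spell checker sees "M" and "lle" as two words instead of one.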
CheckHtmlEsis flags “news” URIs as errors. It probably flags other less common URI schemes as errors as well.
The “ARCHIVE” attribute of the “OBJECT” element should be a space-separated list of URIs. CheckHtmlEsis will incorrectly flag an error if it contains more than one URI.
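For anyone who wants to tackle the two URI problems, here is a rough sketch of one possible approach: split a list-valued attribute on whitespace before checking it, and only try to fetch URIs whose scheme is http, skipping the rest instead of flagging them. This is an assumption about how a fix could look, not the current CheckHtmlEsis behavior. The class and method names are hypothetical, and the code targets a current JDK.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of the CheckHtmlEsis source.
class UriListSketch {
    // Splits a space-separated URI list, such as the value of OBJECT's ARCHIVE attribute.
    static List<String> splitUriList(String attributeValue) {
        List<String> uris = new ArrayList<>();
        for (String part : attributeValue.trim().split("\\s+")) {
            if (!part.isEmpty()) {
                uris.add(part);
            }
        }
        return uris;
    }

    // True only for absolute http URIs; relative URIs and other schemes return false.
    static boolean isHttpUri(String uri) {
        try {
            String scheme = URI.create(uri).getScheme();
            return "http".equalsIgnoreCase(scheme);
        } catch (IllegalArgumentException malformed) {
            // URI.create rejects syntactically invalid input.
            return false;
        }
    }

    public static void main(String[] args) {
        for (String uri : splitUriList("http://example.org/a.jar news:comp.lang.java")) {
            System.out.println(uri + " -> " + (isHttpUri(uri) ? "check over http" : "skip"));
        }
    }
}
```

Whether a skipped URI should produce a warning, or nothing at all, is a separate design decision that the sketch leaves open.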
The source code is a bit messy, and has very little documentation.
CheckHtmlEsis uses data resources to check attribute types like %LanguageCode. Users can recompile these data files to pick up new additions. This process needs to be documented. Ambitious people can look through the code: see the “Main” routines of the CheckCharsetAttribute, CheckContentTypeAttribute, CheckLanguageCodeAttribute, CheckLinkTypesAttribute, and CheckMediaDescAttribute classes.
I want to check “context-sensitive” errors in the document, such as:
<OL> <LI TYPE=disc>List Item </OL>
and
<P> <INS><DIV>...block-level content...</DIV></INS> </P>
THIS SOFTWARE IS PUBLIC DOMAIN AND IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHOR OR ANYONE ELSE BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.