
   
                              CheckHtmlEsis 1.0.2
                                       
   CheckHtmlEsis is a [3]Java program that reads the output of
   [4]James Clark's [5]nsgmls program and checks whether the attributes
   comply with the [6]HTML 4.0 specification. Additionally, it can check
   whether the http [7]URIs referenced by the document exist.
   
How to get CheckHtmlEsis

   CheckHtmlEsis 1.0.2 is available to [8]download in zip file format.
   
How do I use CheckHtmlEsis

  System Requirements
  
   To use CheckHtmlEsis, you need to have [9]JDK 1.1 and [10]nsgmls
   installed and running on your system. CheckHtmlEsis may work with JDK
   1.0 but I have not tested it. CheckHtmlEsis is run from the command
   line.
   
   Note to Mac users: I have no idea how this would run on a Mac.
   [11]Feedback is welcome.
   
  Running CheckHtmlEsis
  
   CheckHtmlEsis reads the output of [12]nsgmls from Standard Input. I
   usually run it with a command line such as:
   
   nsgmls -l \sgml\decl\HTML4.decl index.html | java -classpath
   e:\java11\lib\classes.zip;f:\CheckHtmlEsis\classes.zip
   oconnor.russell.html.CheckHtmlEsis -l -b
   http://www.undergrad.math.uwaterloo.ca/%7Eroconnor/
   
   Error messages will be written to Standard Error. If the -t option is
   given, Text will be written to Standard Out.
   
   How you run CheckHtmlEsis will vary based on where you have your HTML
   declaration, where you have [13]Java installed, where you install
   CheckHtmlEsis, and what OS you are running. Consult your local
   documentation for nsgmls, [14]Java, and your OS for more information.
   You may find it useful to wrap this program in a batch file or shell
   script.
   
   [15]nsgmls should be run with the -l option so that CheckHtmlEsis can
   get access to the line numbers of the original document. This helps
   you locate errors in your document.
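
   To illustrate why the -l option matters, here is a minimal sketch of
   how a reader of nsgmls output could remember the most recent line
   number and attach it to each attribute. It assumes the documented
   nsgmls output format, where "L" lines carry a line number and "A"
   lines carry an attribute. The class and method names are hypothetical
   and are not taken from the CheckHtmlEsis source.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: EsisLineTracker is a hypothetical name, not a class
// from the CheckHtmlEsis distribution.
class EsisLineTracker {
    // Pairs each attribute line ("Aname ...") with the most recent line
    // number reported by an "L" line, which nsgmls emits only with -l.
    static List<String> locateAttributes(List<String> esisLines) {
        List<String> located = new ArrayList<>();
        int currentLine = -1; // -1 means no "L" line has been seen yet
        for (String line : esisLines) {
            if (line.isEmpty()) {
                continue;
            }
            char command = line.charAt(0);
            String rest = line.substring(1);
            if (command == 'L') {
                // "Llineno" or "Llineno file": the number is the first field.
                currentLine = Integer.parseInt(rest.split(" ", 2)[0]);
            } else if (command == 'A') {
                // "Aname type value": the attribute name is the first field.
                String name = rest.split(" ", 2)[0];
                located.add(name + "@" + currentLine);
            }
        }
        return located;
    }

    public static void main(String[] args) {
        List<String> sample =
            List.of("L12", "AHREF CDATA index.html", "(A", "-home", ")A");
        System.out.println(locateAttributes(sample));
    }
}
```

   Without -l there are no "L" lines, so every attribute in this sketch
   would be reported with the placeholder line number -1.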
   
   With the -t option CheckHtmlEsis will output the text in the document.
   This can be used with programs like ispell. For example:
   
   nsgmls -l index.html | java -classpath
   /usr/local/java/lib/classes.zip:/u/roconnor/classes.zip
   oconnor.russell.html.CheckHtmlEsis -t | ispell -l | sort -u >
   index.html.sperr
   
  Options
  
   CheckHtmlEsis takes a few command line options.
   
   -l
          Turn on link checking. With this option CheckHtmlEsis will make
          HTTP connections to see if linked resources are actually
          available, or have moved.
          
   -b <base URI>
          Resolves relative links using <base URI> as the base. This is
          only important if you use the -l option.
          
   -w
          Suppress warnings. Only error messages will be written to
          Standard Error; warnings are omitted.
          
   -t
          Write all text to Standard Out. Any element character data will
          be written to standard out. Element data for the SCRIPT and
          STYLE elements won't be written. I do not consider that data to
          be text. The data from attributes with text will also be
          written.
          
   -h or -?
          Displays help.
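
   As an illustration of how the -b and -l options fit together, the
   following sketch resolves a link against a base URI and classifies an
   HTTP status code. It uses the standard java.net.URI class. The class
   name, the method names, and the mapping of status codes to
   "available", "moved", and "error" are assumptions made for this
   example and may differ from what CheckHtmlEsis actually does. The
   HTTP connection itself is left out.

```java
import java.net.URI;

// Illustrative only: LinkResolver is a hypothetical name, not a class
// from the CheckHtmlEsis distribution.
class LinkResolver {
    // Resolves a possibly relative link against the base given with -b.
    // An absolute link is returned unchanged.
    static URI resolve(String base, String href) {
        return URI.create(base).resolve(href);
    }

    // Assumed classification of an HTTP status code: 2xx counts as
    // available, 3xx as moved (a redirect), anything else as an error.
    static String classify(int status) {
        if (status >= 200 && status < 300) {
            return "available";
        }
        if (status >= 300 && status < 400) {
            return "moved";
        }
        return "error";
    }

    public static void main(String[] args) {
        String base = "http://www.undergrad.math.uwaterloo.ca/%7Eroconnor/";
        System.out.println(resolve(base, "CheckHtmlEsis/index.html"));
        System.out.println(classify(404));
    }
}
```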
          
BUGS

   When CheckHtmlEsis dumps text to Standard Out, element data is split
   up at the boundaries of the elements. For example M<SUP>lle</SUP>
   will be output as:
M
lle
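
   The split can be reproduced with a small sketch. In nsgmls output,
   character data arrives as separate "-" lines on either side of the
   "(SUP" and ")SUP" lines, so writing each data line on its own output
   line breaks the word. The class and both methods are hypothetical and
   only illustrate the behaviour; escape sequences in the data are
   ignored here.

```java
import java.util.List;

// Illustrative only: EsisTextDump is a hypothetical name, not a class
// from the CheckHtmlEsis distribution.
class EsisTextDump {
    // Writes every data line ("-data") on its own output line. This
    // reproduces the split described above.
    static String dumpPerChunk(List<String> esisLines) {
        StringBuilder out = new StringBuilder();
        for (String line : esisLines) {
            if (line.startsWith("-")) {
                out.append(line.substring(1)).append('\n');
            }
        }
        return out.toString();
    }

    // Concatenates data lines without any separator, so data on either
    // side of an element boundary stays together.
    static String dumpJoined(List<String> esisLines) {
        StringBuilder out = new StringBuilder();
        for (String line : esisLines) {
            if (line.startsWith("-")) {
                out.append(line.substring(1));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> sample = List.of("-M", "(SUP", "-lle", ")SUP");
        System.out.print(dumpPerChunk(sample));
        System.out.println(dumpJoined(sample));
    }
}
```

   Joining everything is not a real fix, since it would also glue
   together words from neighbouring block-level elements. A proper fix
   would probably need to tell inline elements such as SUP apart from
   block-level ones.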

   CheckHtmlEsis flags "news" URIs as errors, and probably flags other
   less common URI schemes as errors as well.
   
   The "ARCHIVE" attribute of the "OBJECT" element should be a
   space-separated list of URIs. CheckHtmlEsis will incorrectly flag an
   error if it contains more than one URI.
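
   One possible shape for a fix is sketched below: split the attribute
   value on whitespace and parse each piece as its own URI. The class
   and method names are hypothetical, and this is not the code used by
   CheckHtmlEsis.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: ArchiveAttribute is a hypothetical name, not a
// class from the CheckHtmlEsis distribution.
class ArchiveAttribute {
    // Splits a space-separated ARCHIVE value and parses each entry.
    // URI.create throws IllegalArgumentException for a malformed entry.
    static List<URI> parse(String value) {
        List<URI> uris = new ArrayList<>();
        for (String token : value.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // the value was empty or contained only spaces
            }
            uris.add(URI.create(token));
        }
        return uris;
    }

    public static void main(String[] args) {
        System.out.println(parse("classes.zip  lib/extra.zip"));
    }
}
```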
   
   The source code is a bit messy, and has very little documentation.
   
   CheckHtmlEsis uses data resources to check attribute types like
   %LanguageCode. These data files can be recompiled by users who want
   to pick up new additions, but this process still needs to be
   documented. Ambitious people can look through the code. Look at the
   ``Main'' routines for the CheckCharsetAttribute,
   CheckContentTypeAttribute, CheckLanguageCodeAttribute,
   CheckLinkTypesAttribute, and CheckMediaDescAttribute classes.
   
Future Development

   I want to check ``Context Sensitive'' errors in the document, such
   as:
   
<OL>
<LI TYPE=disc>List Item
</OL>

   and
<P>
<INS><DIV>...block-level content...</DIV></INS>
</P>
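
   As a rough idea of what the first kind of check could look like, the
   sketch below keeps a stack of open elements while reading nsgmls
   output and compares the TYPE of each LI with its parent list. It
   relies on nsgmls writing the "A" lines of an element before its "("
   line. The allowed values follow the HTML 4.0 descriptions of OL and
   UL. All names are hypothetical, only OL and UL parents are checked,
   and none of this exists in CheckHtmlEsis 1.0.2.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;

// Illustrative only: ListItemTypeCheck is a hypothetical name, not a
// class from the CheckHtmlEsis distribution.
class ListItemTypeCheck {
    static final Set<String> OL_TYPES = Set.of("1", "a", "A", "i", "I");
    static final Set<String> UL_TYPES = Set.of("disc", "square", "circle");

    static boolean isValid(String parent, String type) {
        if (parent.equals("OL")) {
            return OL_TYPES.contains(type); // case-sensitive values
        }
        if (parent.equals("UL")) {
            return UL_TYPES.contains(type.toLowerCase());
        }
        return true; // other parents are not checked in this sketch
    }

    static List<String> check(List<String> esisLines) {
        List<String> errors = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // currently open elements
        String pendingType = null; // TYPE seen for the next start tag
        String prefix = "ATYPE CDATA ";
        for (String line : esisLines) {
            if (line.startsWith(prefix)) {
                pendingType = line.substring(prefix.length());
            } else if (line.startsWith("(")) {
                String name = line.substring(1);
                String parent = open.peek();
                if (name.equals("LI") && pendingType != null
                        && parent != null && !isValid(parent, pendingType)) {
                    errors.add("TYPE=" + pendingType
                        + " is not valid for LI inside " + parent);
                }
                open.push(name);
                pendingType = null;
            } else if (line.startsWith(")") && !open.isEmpty()) {
                open.pop();
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "(OL", "ATYPE CDATA disc", "(LI", "-List Item", ")LI", ")OL");
        System.out.println(check(sample));
    }
}
```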

Legal Stuff

   This program is [16]public domain.
   
   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
     _________________________________________________________________
   
   
    [17]Russell O'Connor: [18]roconnor@math.berkeley.edu

References

   1. http://www.math.berkeley.edu/~roconnor/
   2. http://www.math.berkeley.edu/~roconnor/publicDomain.html
   3. http://java.sun.com/
   4. http://www.jclark.com/
   5. http://www.jclark.com/sp/
   6. http://www.w3.org/TR/REC-html40/
   7. http://www.ietf.org/rfc/rfc2396.txt
   8. http://math.berkeley.edu/~roconnor/CheckHtmlEsis/CheckHtmlEsis.zip
   9. http://java.sun.com/products/index.html
  10. http://www.jclark.com/sp/
  11. mailto:roconnor@math.berkeley.edu
  12. http://www.jclark.com/sp/
  13. http://java.sun.com/
  14. http://java.sun.com/
  15. http://www.jclark.com/sp/
  16. http://www.math.berkeley.edu/~roconnor/publicDomain.html
  17. http://www.math.berkeley.edu/~roconnor/
  18. mailto:roconnor@math.berkeley.edu
