Re: Embedding into HTML [was signature/encryption tags]

Craig Hubley (craig@passport.ca)
Wed, 12 Apr 95 18:23:40 EDT

> One of the main things I am concerned with at this point is the issue of
> actually embedding a program in a document, which seems to be closely
> related to these comments about signatures:

Yes. As is directly embedding any kind of information which is not intended
for human consumption but for machine use (a filename is about the only real
crossover I can think of). All of this, to quote myself:
> > is information that is
> > - not directly human readable (i.e. must be generated/read by a program)
> > - potentially quite large (i.e. with a large key it could be 100s of bytes,
> >   and generally will expand as key sizes expand to potentially several K)
> > - not of interest to all readers (i.e. as graphics are often 'turned off'
> > by many readers, authentication/signatures are often of interest only
> > to those who intend to act directly on the information in some binding
> > way)
> > - not directly useful in processing other information on/in the page

My intent was to invoke the (implied) SGML principle of human-readable and
human-editable information being the only legal content of an SGML document.
By having no mandatory binary or otherwise non-ASCII-encoded information it
has been possible for ASCII editors to generate SGML documents with only a
template. A major factor in its success.

> Is this a good general set of rules to go by for what should and
> shouldn't be included directly in an HTML document? My current

We should discuss what the SGML rules are first... can we find SGML DTDs
(other than HTML) that include tags that have:
- non-human-readable data ? (e.g. UUencoding) that can't be altered by humans
- machine-generated data? (e.g. object attributes) that might be edited by humans
- commands ? (e.g. embedded programs) note that these are human readable
and writable, and that a GIF file can be considered a 'program' or 'command'
to show a picture in the trivial sense.

I can't recall seeing any. If not, there is probably a reason why not. I'd
like to ask some SGML gurus about the issues involved in tags that must be
verified by a specialized program to see if they are legal or well-formed.

> implementation uses a structure in which it adds a new major section at the
> <HEAD> and <BODY> level, called the <INTERFACE>. This section is
> currently defined as:

We built something similar as part of a revision control system some years
ago. Worked fine, generated C++ headers, body files, publishable headers.
Not using SGML though, we tried to keep it looking like C++ comments so
that the raw files would still compile (they did). Doing it today I'd
do it in SGML.

> So, it contains zero or more <MODULE> tags, which declare all the external
> modules needed by the program, and <SOURCE>, whose content is the actual
> script which will be executed. [A bit more detailed description of this is

An SGML-based 'make' replacement ? Intriguing... that makes it possible to
have SGML-based 'rcs' replacements too... very intriguing.

> at <http://www.cs.orst.edu/~hackbod/exechtml/>, which is an extremely
> preliminary design, but it's something. :)]

Does it refer to other work in this area ?

> an extern URL like the modules are. But is it -absolutely- a bad thing to
> make it possible to have an embedded program? It seems that this case

I think embedded programs of a few lines make sense in many cases. If they
are human-editable I don't see any reason why they couldn't be SGML tags.
An HTML anchor is one such. SGML HyTime may specify some guidelines here
as that committee was pretty thorough about active and time-based data...

> wouldn't fall into the last two categories listed above -- it most likely
> is of interest to all browsers, and it is directly useful for processing
> the other information in the page.

It is only of interest if browsers *must* execute the code to get the
intent of the page across... for instance if it specifies transformations
on the rest of the page that make them sensible (e.g. decrypting graphics).

If all the program does is wave 'hi Mom!' then who cares, it's dispensable
as graphics often are. But there is no standard tag to tell the browser
that the information in the graphics (or other embedded information) is or
is not duplicated by the text-based tags... this might be useful, to know
if the availability of graphics/signature/program processing is or is not
critical to the understanding of the page. I see no way for programs to
guess. Some authors will abuse it and 'insist' that no one see their page
unless they are running the required MPEG viewer so they can wave 'hi Mom'
but that is their problem...
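
To make the idea concrete, here is a minimal sketch in Python of how a
browser might use such a marker. It assumes a hypothetical 'essential' flag
on each embedded item; no such tag or attribute exists in HTML today, and
the names Embedded and can_present and the media type
"application/x-decrypt" are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Embedded:
    """One piece of embedded content (graphic, signature, program)."""
    media_type: str   # e.g. "image/gif"
    essential: bool   # hypothetical flag: page cannot be understood without it


def can_present(items, supported_types):
    """Return (ok, missing); ok is False if any essential item is unsupported."""
    missing = [item.media_type for item in items
               if item.essential and item.media_type not in supported_types]
    return (not missing, missing)


page = [
    # Decoration whose information is duplicated by the text: safe to skip.
    Embedded("image/gif", essential=False),
    # Hypothetical type: a transformation needed to make the page sensible.
    Embedded("application/x-decrypt", essential=True),
]

ok, missing = can_present(page, supported_types={"image/gif"})
print(ok, missing)
```

With a flag like this the browser no longer has to guess: it can silently
skip the dispensable items and warn the reader only when something marked
essential cannot be processed.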

So should this 'add-on viewers are absolutely essential to getting the point
of this page' be considered as a basic HTML tag ? Opening it for discussion...
Also still looking for feedback on whether SGML DTDs can provide precedent
for inclusion of 'active data' such as programs or signatures, that require
certain software to be processed, directly in a tag rather than a MIME file.

Some postings recently seemed to indicate 'no, leave it in MIME', which is
a position that I generally support. In any case I do not support any HTML
solution that is not compatible with existing standards and practice in SGML
so I would want to review those precedents first.

-- 
Craig Hubley                Business that runs on knowledge
Craig Hubley & Associates   needs software that runs on the net
craig@passport.ca     416-778-6136    416-778-1965 FAX
Seventy Eaton Avenue, Toronto, Ontario, Canada M4J 2Z5