Re: proposed registration of type 'text/html' for MIME

Dan Connolly <connolly@pixel.convex.com>
Message-id: <9211110050.AA13695@pixel.convex.com>
To: Edward Vielmetti <emv@msen.com>
Cc: www-talk@nxoc01.cern.ch
Subject: Re: proposed registration of type 'text/html' for MIME 
In-reply-to: Your message of "Tue, 10 Nov 92 18:58:17 EST."
             <m0mp5Tc-00009TC@garnet.msen.com> 
Date: Tue, 10 Nov 92 18:50:01 CST
From: Dan Connolly <connolly@pixel.convex.com>

>Thanks for the message, Dan.  A few points.
>
>I am not comfortable referencing documents (in an IETF message) that
>are available only via the system in which I'm trying to document.
>I.e. for the purpose of conveying to the IETF what all we're up to
>it would be best to have files in the anonymous FTP area and rendered
>in ASCII.  

Point taken. But we can certainly come up with an ASCII version of
http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html . There's no
need to use the HTTP document.

And the HTML DTD is a plain ASCII document as is. I'm not sure
if it's available via ftp, but certainly that's not an insurmountable
obstacle.

>Calling HTML an "SGML application" is not a bad long term plan.  I
>fear there's some risk in ease of implementation from
>	Content-type: text/sgml; dtd="(string that identifies html.dtd)"
>compared to
>	Content-type: text/html
>and as such I'd prefer to not haul in all of the SGML standard in the
>description of the system, not right up front at least.  Better to
>spec something that you can deliver and play with rather than stretch
>things out to their limits.  

Uuugh! Do I have to write a "Misconceptions about SGML" essay? I
never said anything about content-type: text/sgml. I did talk
about hauling the SGML standard in, but that only requires the few
changes I pointed out. There's no need to implement a whole SGML parser.

But I'd say ISO 8879 + html.dtd is a better spec for the syntax of
HTML than any English description we can come up with in the near term.
And the existing WWW code works just fine on conforming documents. [It
also groks non-conforming documents, but I don't see any crime in that.]

After all, I think this is the intent of the designers of HTML:

	HTML is not an alternative to SGML, it is a particular
	format within the SGML rules (an SGML "DTD"). [http.txt]


And, if we start to enforce SGML compliance, we may be able to do things
like using SGML editors, translators, browsers, etc. If we don't enforce
compliance, we might as well not use SGML at all!

>Dan, if 
>	http://info.cern.ch/hypertext/WWW/MarkUp/HTML.dtd
>is in fact something that should get a "public text identifier" (some
>kind of ISBN number?) then we should do it.  That would be a very
>useful document to reference in the comments section.

From what I can tell, there are three kinds of public text identifiers:
ISO ids, registered owner ids, and unregistered owner ids. The first
kind refer to ISO documents. The second category is for documents by
ISO members, I guess. And as far as I can tell, the third category
is anybody's game (kinda like x- identifiers).

They look like this:

-//unregistered owner//class name//lang

The class is DTD. The lang is probably EN (English). The owner and
name are pretty much up for grabs. Usually the country is included
in the owner, even for unregistered owners.

So here's a stab:

-//USA-IANA//DTD HTML//EN

or, in context:

<!DOCTYPE HTML PUBLIC "-//USA-IANA//DTD HTML//EN">

But perhaps CERN, rather than IANA, should be the owner.
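
To make the layout of these identifiers concrete, here is a rough sketch in
Python that pulls one apart. It is only an illustration, not anything from
the WWW code: the function name parse_public_id and the dictionary keys are
made up here, and it assumes the four-field "-//owner//class name//lang"
form shown above. ISO ids are not handled.

```python
def parse_public_id(public_id):
    """Split a public text identifier into its parts.

    Assumes the four-field layout: prefix//owner//class name//lang.
    The function name and the returned keys are hypothetical.
    """
    fields = public_id.split("//")
    if len(fields) != 4:
        raise ValueError("expected 4 fields separated by '//': %r" % public_id)
    prefix, owner, text, lang = fields
    # The third field holds the class (e.g. DTD), a space, then the name.
    text_class, _, name = text.partition(" ")
    return {
        "unregistered": prefix == "-",  # a leading "-" marks an unregistered owner
        "owner": owner,
        "class": text_class,
        "name": name,
        "lang": lang,
    }


parts = parse_public_id("-//USA-IANA//DTD HTML//EN")
```

Swapping the owner (say, CERN instead of IANA) changes only the second
field; the class, name, and lang stay the same.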

Dan