This whole issue is not clear to me at all.
I aim to specify, in the HTML 2.0 document, with SGML as a normative
reference, a language in the formal sense of the word; that is, a set
of strings over some set of symbols.
[The issue of the set of terminal symbols is hairy all by itself, but
for the sake of argument, let's fix the terminal symbols, or alphabet,
at the ISO-8859-1 character repertoire.]
So I stick some public text in the specification -- some "SGML code,"
if you will. Two questions come up: (1) what is the language specified
by that public text, and (2) if I want my language to be a strict
subset of that language, under what circumstances do I still have a
conforming SGML application?
For example, here's a conforming SGML document that is not in the
language that I intend to specify:
<!doctype input public "-//IETF//DTD HTML 2.0//EN">
<input>
So I wrote this in the HTML 2.0 spec:
|To identify information as an HTML document conforming to this
|specification, each document should start with the prologue:
|
|<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
|
|(11)
|
|If the body of a text/html message entity does not begin with a
|document type declaration, an HTML user agent should infer the above
|document type declaration.
|
|HTML user agents are required to support the above document type
|declaration, the following document type declarations, and no others.
|
|<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0 Strict//EN">
|<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML Strict//EN">
|
|In particular, they may support other formal public identifiers, or
|document types altogether. They may support an internal declaration
|subset with supplemental entity, element, and other markup
|declarations, or they may not.
The idea is that the HTML language is specified as those conforming
SGML documents whose prologue is one of the above, given that the
FPIs resolve to the public text given in the spec.
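To make the prologue rule concrete, here is a minimal sketch in Python of the membership test on the prologue alone. It does not parse SGML or check conformance against the DTD; the function names and the deliberately narrow regular expression are assumptions made for illustration, not anything taken from the spec.

```python
import re

# The three formal public identifiers listed in the quoted spec text.
ACCEPTED_FPIS = (
    "-//IETF//DTD HTML 2.0//EN",
    "-//IETF//DTD HTML 2.0 Strict//EN",
    "-//IETF//DTD HTML Strict//EN",
)

# The declaration a user agent should infer when none is present.
DEFAULT_PROLOGUE = '<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">'

# Hypothetical, narrow pattern: document element HTML, one double-quoted
# FPI, and no internal declaration subset before the closing ">".
_PROLOGUE = re.compile(
    r'\s*<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]*)"\s*>', re.IGNORECASE
)


def with_inferred_prologue(body: str) -> str:
    """Prepend the default declaration if the body has no DOCTYPE at all."""
    if re.match(r"\s*<!DOCTYPE\b", body, re.IGNORECASE):
        return body
    return DEFAULT_PROLOGUE + "\n" + body


def has_accepted_prologue(body: str) -> bool:
    """True if the (possibly inferred) prologue is one of the listed ones."""
    match = _PROLOGUE.match(with_inferred_prologue(body))
    return match is not None and match.group(1) in ACCEPTED_FPIS
```

Under these assumptions, the `<!doctype input ...>` example above is rejected because its document element is not HTML, a body with no declaration is accepted via the inferred default, and a declaration that carries an internal declaration subset is rejected because the pattern requires `>` directly after the FPI.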
On the other hand, if we're not allowed to have application
conventions that prohibit marked sections, then how can we prohibit
internal declaration subsets? In fact, how can we prohibit documents
that define a vastly different grammar by redefining parameter
entities and declaring new element types? Can a conforming SGML
application even specify the document element?
Clues?
Dan