Re: CAPTION in FIG element

Joe English (joe@trystero.art.com)
Thu, 16 Mar 1995 14:10:58 PST

lilley <lilley@afs.mcc.ac.uk> wrote:

> Looking at the DTD (Draft: Mon 13-Mar-95 09:51:25) I see:
>
> <!ELEMENT FIG - - (OVERLAY*, CAPTION?, %body.content;, CREDIT?) -(FIG|IMG)>
>
> which (and I could well be wrong) I take to mean FIG is a container and
> can contain (0 or more) OVERLAY, (an optional) CAPTION, a required
> body.content (ie marked up text, roughly?) and (an optional) CREDIT.
>
> Is that right?

That's correct.

> However, attempting to check some HTML 3 gives a different story:
>
> <FIG HREF="nicodamus.jpeg">
> <CAPTION>Ground dweller: <I>Nicodamus bicolor</I>
> builds silk snares</CAPTION>
> <P>A small hairy spider light fleshy red in color with a brown abdomen.
> <CREDIT>J. A. L. Cooke/OSF</CREDIT>
> </FIG>
>
> sgmls: SGML error at figures.html3, line 10 at ">":
> FIG end-tag implied by CAPTION start-tag; not minimizable
> sgmls: SGML error at figures.html3, line 10 at ">":
> Out-of-context CAPTION start-tag ended HTML document element (and parse)
>
> Huh? I don't understand this and I don't understand how SGMLS gets this from
> the DTD. If some kind soul could explain it to me, I would be grateful.

If you turn on the HTML.Recommended switch,
this problem goes away:

<!DOCTYPE HTML PUBLIC "-//W3O//DTD W3 HTML 3.0//EN//"
[
<!ENTITY % HTML.Recommended "INCLUDE">
]>

Here's what's happening:

With %HTML.Recommended; turned off (the default), %body.content;
expands to include #PCDATA. Because of that, the space right before
the <CAPTION> start-tag is parsed as data instead of a separator,
since data would be legal right here -------v

<!ELEMENT FIG - - (OVERLAY*, CAPTION?, %body.content;, CREDIT?) -(FIG|IMG)>

That's why it's complaining about the <CAPTION> tag --
when it saw the space, it had to assume that it was
#PCDATA in the %body.content; part of the content model,
and the <CAPTION> has to come before that part.

With %HTML.Recommended; turned on, FIG has "element content",
so spaces and record ends are always treated as separators
like you'd expect.
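
As a rough sketch of how a switch like this can work (this is not
copied from the draft DTD; the layout and the %text entity in the
fallback expansion are assumptions made for illustration):

```sgml
<!-- Default: switch is off. A declaration in the internal subset of
     the DOCTYPE is read first and so overrides this one. -->
<!ENTITY % HTML.Recommended "IGNORE">

<![ %HTML.Recommended; [
  <!-- Only seen when the switch is INCLUDE: block-level elements only,
       which gives FIG element content. -->
  <!ENTITY % body.content "(DIV|%heading|%block|HR|ADDRESS)*">
]]>

<!-- Fallback: %text brings in #PCDATA, which gives FIG mixed content.
     Ignored if body.content was already declared above. -->
<!ENTITY % body.content "(DIV|%heading|%text|%block|HR|ADDRESS)*">
```

The first declaration of an entity is the one that counts, which is
why the strict version has to come before the fallback.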

You can also fix the problem by getting rid of all the
whitespace (spaces and line breaks) between tags:

<FIG SRC="nicodamus.jpeg"><CAPTION>
Ground dweller: <I>Nicodamus bicolor</I> builds silk snares
</CAPTION><P>
A small hairy spider light fleshy red in color with a brown abdomen.
</P><CREDIT>J. A. L. Cooke/OSF
</CREDIT></FIG>

This is the main reason that elements with mixed content
should use only repeatable OR groups, such as
(#PCDATA | A | B)*; otherwise there are too many parsing
problems just like this one.
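
To illustrate the difference, here is a minimal sketch. PARA, NOTE,
TITLE and EM are made-up element names used only for this example,
not declarations from the HTML 3.0 DTD:

```sgml
<!-- Safe: mixed content as a single repeatable OR group.
     Data and child elements may appear in any order, so whitespace
     between tags never ends up out of place. -->
<!ELEMENT PARA - - (#PCDATA | EM)*>

<!-- Risky: #PCDATA inside a sequence group. Whitespace before <TITLE>
     counts as data, which moves the parser past TITLE? -->
<!ELEMENT NOTE - - (TITLE?, (#PCDATA | EM)*)>
```

With the second declaration, a document that puts a space or line
break between <NOTE> and <TITLE> can hit the same kind of error that
FIG produces above.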

Dave, I would suggest changing FIG's content model to
something like:

<!ELEMENT FIG - - (OVERLAY*, CAPTION?, %strict.body.content;, CREDIT?)
-(FIG|IMG)>

where %strict.body.content; always expands to

"(DIV|%heading|%block|HR|ADDRESS)*"

(i.e., no #PCDATA or phrase-level elements), regardless
of whether %HTML.Recommended; is set or not.

(%strict.body.content; isn't the best name, but you
get the idea...)

--Joe English

joe@trystero.art.com