Re: Suppressed content in HEAD: myth or reality?

Dan Connolly (connolly@w3.org)
Thu, 4 May 95 03:11:14 EDT

Next message: Dan Connolly: "Re: ISO/IEC 10646 as Document Character Set"
Previous message: Gavin Nicol: "Re: ISO/IEC 10646 as Document Character Set"

Eric Bina writes:
>
> > In some circles, the conventional wisdom is "if you find data
> > characters anywhere in the <HEAD> element, don't display them in the
> > same text window where you show the <body> stuff."
>
> Any good suggestions on what to do when the user opens a <HEAD> and forgets to
> close it?

You mean besides: parse as per SGML?

> At what point should we assume the closure of the <HEAD>? We can't
> do it at the first thing that the browser thinks shouldn't be there, because
> in that case, a browser that didn't understand <STYLE>data</STYLE> would
> assume closure of the HEAD as soon as it saw "data". I suppose we can assume
> it at the <BODY>, but we still leave the opportunity for a doc with <HEAD>,
> no </HEAD> and no <BODY> which would just disappear into limbo inside the
> browser. It sure worries me.

Blech. OK. You're talking about how the "if you don't recognize the
tag, throw it out" convention interacts with "if you're in HEAD, don't
show data". Creative kludging.

Here goes nothing:

HTML -> <html>?, HEAD, BODY, </html>?

HEAD -> <head>?, head-content, </head>?

head-content -> TITLE | META | LINK | ...
| UNKNOWN-HEAD

TITLE -> <title>, data chars, </title>

META -> <meta>

UNKNOWN-HEAD -> <xxx>, unknown-head-content, </yyy>

unknown-head-content -> data chars
| head-content

BODY -> <body>?, body-content, </body>?

body-content -> data chars | H1 | H2 | UL | OL | P | ...
| UNKNOWN-BODY

UNKNOWN-BODY -> <xxx> | </yyy>

I believe that's an LR(1) grammar, i.e. it's implementable. It
requires that you keep track of how many unknown tags you're inside
when you're in the head, but not their names.

Hmmm... it also implies we can't add any empty tags to the
head. Otherwise it becomes ambiguous whether data chars are in the
head or the body. Bad news for future innovations.
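To make the bookkeeping concrete, here is a rough sketch of the grammar above as a small state machine in Python. It is not a real SGML parser. The tag sets, the regex tokenizer, and the name split_head_body are made up for illustration, and the choice to let a known body tag close the head even inside an unknown element is just one possible recovery rule. The only state kept for unknown head tags is a depth counter, never their names.

```python
import re

# Assumption: illustrative tag sets only, not an official list.
HEAD_EMPTY = {"meta", "link"}          # known empty head tags
WRAPPERS = {"html", "head"}            # optional wrapper tags
BODY_TAGS = {"body", "h1", "h2", "ul", "ol", "li", "p"}

# A tag (optional slash, name, anything up to '>') or a run of data chars.
TOKEN = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>|([^<]+)")

def split_head_body(text):
    """Return (title, suppressed, body) strings for a document."""
    title, suppressed, body = [], [], []
    in_head, in_title, depth = True, False, 0
    for m in TOKEN.finditer(text):
        slash, name, data = m.group(1), m.group(2), m.group(3)
        if data is not None:
            if not in_head:
                body.append(data)
            elif in_title:
                title.append(data)
            elif depth > 0:
                # Data inside an unknown head element: don't display it.
                suppressed.append(data)
            elif data.strip():
                # Non-blank data at depth 0: the head has ended implicitly.
                in_head = False
                body.append(data)
            continue
        if not in_head:
            continue  # body tags, known or unknown, are ignored here
        name = name.lower()
        is_end = slash == "/"
        if name == "title":
            in_title = not is_end
        elif name in BODY_TAGS or (name == "head" and is_end):
            in_head = False
        elif name in HEAD_EMPTY or name in WRAPPERS:
            pass
        elif is_end:
            # Any unknown end tag closes one level; its name is not checked.
            depth = max(depth - 1, 0)
        else:
            depth += 1  # unknown start tag opens one level
    return "".join(title), "".join(suppressed), "".join(body)

# <HEAD> never closed, no <BODY>, unknown <STYLE> container in the head.
print(split_head_body("<HEAD><TITLE>T</TITLE><STYLE>data</STYLE>Hello"))
# A hypothetical future *empty* head tag swallows the text that follows it.
print(split_head_body("<HEAD><NEWEMPTY>Hello world"))
```

In the first call the STYLE content is suppressed and "Hello" still lands in the body, even with no </HEAD> and no <BODY>. In the second call the parser cannot tell that NEWEMPTY has no end tag, so everything after it is treated as head content and vanishes, which is the ambiguity with empty tags.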

This is a pretty good argument for saying "all data characters except
TITLE" are in the body. Corollary: all style info has to go in
attributes or linked documents.

Blech.

Oh well.

Dan
