Re: META

Joe English (joe@trystero.art.com)
Tue, 4 Jul 95 16:27:23 EDT

Eric Bina <ebina@netscape.com> wrote:

> I would LOVE to make Netscape ignore content in the HEAD if someone
> could show me a way to do it without breaking tons of current documents
> and without breaking LEGAL HTML documents.

It would suffice to ignore the content of unrecognized
elements (i.e., those not in HTML 2.0) until it can be
determined that the <BODY> element has started.

Untagged character data should imply the beginning of the <BODY>.
For tagged data, either the element is a known HEAD element
(in which case the content should be ignored, e.g., <TITLE>),
it is a known BODY element (in which case </HEAD><BODY> should be
inferred and the content displayed), or it is a new element
(in which case the browser should assume that it belongs
to the <HEAD> and ignore its content, unless some prior tag
or character data has already implied </HEAD><BODY>).

It's safe to assume that the <HEAD> element will not
include #PCDATA in its content model in any future
revision, and that any character data in the <HEAD>
will be enclosed in another element.
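
For anyone who wants to experiment, the rules above can be sketched
roughly as follows. This is only an illustration, not how any real
browser does it: the token format, the function name displayed_text,
and the abbreviated element sets are all made up for the example.
It also assumes that whitespace between tags does not count as
character data and that an explicit </HEAD> ends the head.

```python
# Hypothetical sketch of the HEAD/BODY inference heuristic.
# The element sets are abbreviated and illustrative, not the full HTML 2.0 DTD.
HEAD_ELEMENTS = {"title", "base", "isindex", "link", "meta", "nextid"}
BODY_ELEMENTS = {"h1", "h2", "h3", "h4", "h5", "h6", "p", "pre", "ul", "ol",
                 "dl", "address", "blockquote", "form", "hr", "a", "img"}
STRUCTURAL = {"html", "head"}  # these neither start the body nor hide content


def displayed_text(tokens):
    """Return the character data a browser following the heuristic would show.

    tokens is a list of ("start", name), ("end", name) or ("text", data)
    tuples (an assumed, simplified tokenizer output; comments are omitted).
    """
    in_body = False   # has </HEAD><BODY> been seen or inferred yet?
    ignoring = None   # head-side element whose content is being skipped
    shown = []
    for kind, value in tokens:
        if kind == "start":
            name = value.lower()
            if name == "body" or (ignoring is None and name in BODY_ELEMENTS):
                # Explicit <BODY>, or a known body element: body has started.
                in_body, ignoring = True, None
            elif not in_body and ignoring is None and name not in STRUCTURAL:
                # Known HEAD element (see HEAD_ELEMENTS) or an unrecognized
                # one: treat it as part of the head and ignore its content.
                ignoring = name
        elif kind == "end":
            name = value.lower()
            if name == "head":
                # Assumption: an explicit </HEAD> is enough to end the head.
                in_body, ignoring = True, None
            elif name == ignoring:
                ignoring = None
        elif kind == "text":
            text = value.strip()
            if text and ignoring is None:
                # Untagged character data implies </HEAD><BODY>.
                in_body = True
                shown.append(text)
    return shown
```

Feeding it the token sequence for <html><title>T</title><h1>Heading</h1>
yields only "Heading", since the <TITLE> content is skipped and <H1>
starts the body.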

Example 1:

<html>
<title>Blah</title>
<h1>Blah</h1> <!-- H1 is a known body element; infer </HEAD><BODY> -->

Example 2:

<html>
<title>Blah</title>
blah <!-- character data can only appear in body; infer </HEAD><BODY> -->

Example 3:

<!doctype html PUBLIC "-//IETF//DTD HTML Experimental//">
<html>
<head>
<title>blah</title>
<style>blah...</style>
<!-- <STYLE> unknown; don't infer </HEAD><BODY>; ignore content -->
</head>
<body>
<newel>blah</newel>
<!-- <NEWEL> unknown, but <BODY> has been seen; include content -->

This heuristic will break on things like:

<html>
<title>Blah</title>
<newel>blah</newel>

where NEWEL is supposed to be a body element (its content
would be ignored as if it were in the <HEAD>), but authors
who use experimental features (which NEWEL must be, by
definition, since it's not in HTML 2.0) without including
the <HEAD> ... </HEAD> and/or <BODY> tags should expect to lose.

This heuristic does work on the common case cited earlier:

> 1. <!-- select doctype above... -->
> 2. <HEAD>
> 3. <TITLE><!-- your title here --></TITLE>
> 4.
> 5. abc
> 6. <!-- your HTML test data -->
> 7. </BODY>

since the character data "abc" implies </HEAD><BODY>.

--Joe English

joe@art.com