Re: Last Call: Hypertext Markup Language - 2.0 to Proposed Standard

Joe English (joe@trystero.art.com)
Wed, 26 Jul 95 14:35:04 EDT

"Christopher R. Maden" <crm@ebt.com> wrote:

> Roy Fielding wrote:
> [...]
> > The correct definition for this feature is:
> [...]
> > <!ENTITY % linkExtraAttributes
> > "REL NAMES #IMPLIED
> [...]
>
> That's not quite right either. Production [40] of ISO 8879 allows a
> name token list to be a list of names separated by SPACE, usually
> character 32. This excludes line breaks, tabs, and other whitespace.

But see also section 7.9.3 (pp. 331-332 in Goldfarb) which says:

An _attribute value literal_ is interpreted as
an _attribute value_ by replacing references within it,
ignoring Ee and RS, and replacing an RE or SEPCHAR
with a SPACE.
[...]
An attribute value other than _character data_ is
tokenized by replacing a sequence of SPACE characters
with a single SPACE character and ignoring leading or
trailing SPACE characters.

It is my understanding that the cited production

[40] name token list =
name token,
(SPACE,
name token)*

is applied to the interpreted attribute value, not the
original attribute value literal in the start-tag.

SGMLS appears to agree with this interpretation:

trystero:joe% cat t.sgml
<!doctype test [
<!element test - O EMPTY>
<!attlist test
a NAMES #REQUIRED
>
]>
<test a="foo bar baz qux
qwerty
asdf">
trystero:joe% sgmls t.sgml
AA TOKEN FOO BAR BAZ QUX QWERTY ASDF
(TEST
)TEST
C
trystero:joe%
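
The two steps quoted from 7.9.3 can also be sketched in a few lines of Python. This is only an illustration of the reading given above, not how SGMLS is implemented. It assumes the reference concrete syntax (RS = character 10, RE = character 13, TAB as the only SEPCHAR), does not handle references or Ee, and folds to upper case only to mirror the SGMLS output shown above. All function names are made up for this sketch.

```python
import re

# Assumed function characters (reference concrete syntax).
RS, RE, SPACE, SEPCHAR = "\n", "\r", " ", "\t"

# Production [40]: name token, (SPACE, name token)*
# Assumption: name characters are letters, digits, "." and "-".
NAME_TOKEN = r"[A-Za-z0-9.\-]+"
NAME_TOKEN_LIST = re.compile(rf"{NAME_TOKEN}(?: {NAME_TOKEN})*")

def from_unix_text(text):
    """Model a Unix line break inside a literal as RE followed by RS."""
    return text.replace("\n", RE + RS)

def interpret(literal):
    """Step 1: ignore RS, replace each RE or SEPCHAR with a SPACE."""
    return literal.replace(RS, "").replace(RE, SPACE).replace(SEPCHAR, SPACE)

def tokenize(value):
    """Step 2: collapse runs of SPACE, drop leading and trailing SPACE."""
    return re.sub(" +", " ", value).strip(" ")

def names_value(literal):
    """Apply production [40] to the interpreted value, not the raw literal."""
    value = tokenize(interpret(literal))
    if not NAME_TOKEN_LIST.fullmatch(value):
        raise ValueError("not a name token list: %r" % value)
    return value.upper()  # assumed general name case folding

literal = from_unix_text("foo bar baz qux\nqwerty\nasdf")
print(names_value(literal))  # FOO BAR BAZ QUX QWERTY ASDF
```

Matching the raw literal against NAME_TOKEN_LIST fails, since it still contains the line breaks; matching the interpreted value succeeds, which is the distinction at issue here.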

> I think CDATA might be a better choice, as it would allow more
> flexibility for casual authors.

Then the case-sensitivity issue rears its ugly head...

--Joe English

joe@art.com