This is not a hole in the spec. The documents in question do _not_
conform to the spec. Hence, implementations may do whatever they
choose.

It's getting _really_ boring and tedious (not to mention hazardous to
the future extensibility of the web) specifying how HTML user agents
should treat all these error conditions.

>I feel strongly that this should
>not be the case, and that we should add some simple language to the spec
>specifying how browsers should handle these small characters.

Please suggest something. I think you'll find it difficult to craft
language that's useful but not overly restrictive.

>> I don't have the SGML spec handy
>I don't have it handy either, and while I will have to soon, I don't
>really want to have to refer to it for something simple like this. Why
>should I have to?

Life is hard. HTML is an application of SGML. Get used to it.

>The way character sets and code pages have evolved has made the current
>usage of characters and fonts extremely confusing. This leads to lots of
>errors in implementation, and confuses the heck out of users and browser
>developers alike.
>
>Let's get rid of this confusion now, so that the path towards
>internationalization is easier.

I agree there is confusion in most discussions regarding the term
"character set." That's why I wrote "'Character Set' Considered Harmful".
I hope that it will clarify the terminology, at least as it's used
in the HTML and MIME specs.

>And yes, I am going to write up some proposed language to deal with this,
>and see how people react.

I suggest you do.

Dan