> Japanese text on the World-Wide Web (when served using ISO-2022-JP)
> may contain special characters like <, >, and &. Commonly, it appears
> that people leave these characters in their text, and then others have
> to fix their browsers[1] to interpret markup characters only outside
> of JIS text.

Correct. This shouldn't be a large change to the browser, actually. Handling
non-delimited two-byte characters (Shift-JIS and EUC) is much more annoying.
With ISO 2022 the escape sequences tell you which character set is active,
so you just treat anything not in ASCII or JIS Roman as character data,
not markup.
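
To make that concrete, here is a rough sketch in Python of the kind of scan
I have in mind. It is not taken from any real browser: the function name
markup_positions and the sample bytes are made up for illustration, and it
only knows the four escape sequences listed for ISO-2022-JP in RFC 1468.
It assumes well-formed input and does no error recovery.

```python
ESC = 0x1B

# Designators that follow ESC in ISO-2022-JP (RFC 1468).
SINGLE_BYTE = {b"(B", b"(J"}  # ASCII, JIS X 0201 Roman
DOUBLE_BYTE = {b"$@", b"$B"}  # JIS X 0208-1978, JIS X 0208-1983


def markup_positions(data):
    """Return byte offsets of '<', '>' and '&' that lie outside two-byte text.

    Hypothetical helper: a real parser would hand these offsets to its
    tag and entity recognizer instead of collecting them in a list.
    """
    positions = []
    wide = False  # an ISO-2022-JP stream starts out in ASCII
    i = 0
    while i < len(data):
        if data[i] == ESC:
            designator = data[i + 1:i + 3]
            if designator in DOUBLE_BYTE:
                wide = True
                i += 3
                continue
            if designator in SINGLE_BYTE:
                wide = False
                i += 3
                continue
            # Unknown escape: fall through and treat ESC as an ordinary byte.
        if wide:
            i += 2  # one two-byte character, never markup
        else:
            if data[i] in b"<>&":
                positions.append(i)
            i += 1
    return positions


# "<p>", then one two-byte character whose bytes happen to be 0x3C 0x26
# (the ASCII codes of '<' and '&'), then "</p>".
sample = b"<p>" + b"\x1b$B" + b"\x3c\x26" + b"\x1b(B" + b"</p>"
print(markup_positions(sample))  # [0, 2, 11, 14]
```

A scan that ignores the escapes would also report offsets 6 and 7, the two
bytes of the kanji, which is exactly the breakage described above.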

> From my experience, the former treatment is more widespread than the
> latter. But the latter ensures that there is no chance of documents
> breaking parsers, while occasionally these problems occur in the
> former case. Does this mean the latter is more correct?

I think they are both stopgap measures. I am strongly in favor of using
ISO 10646 with language tags, but of the older schemes ISO 2022 is much
better than non-delimited schemes. As long as you can unambiguously tell
what size each character is, it's not too hard to make your parser handle
wide characters. Having to guess is a pain in the neck.

If you are concerned with not "breaking" non-multilingual browsers, please
also allow multilingual ones to do the right thing without having to be
preconfigured for a particular encoding. This is the biggest headache
with trying to support Japanese right now, and I'd hate to see it happen
with Chinese, Korean, or other languages.

