Re: Unicode browsers (was: Re: Comments on: "Character Set" Considered

Amanda Walker (amanda@intercon.com)
Mon, 1 May 95 22:43:56 EDT

bobj@netscape.com writes:
> Supporting canonical Unicode will require major changes to parser and
> layout engines.

To be more accurate, supporting UCS-2 (which is what I assume you mean
by "canonical Unicode") *may* require major changes, if the parser and
layout engines are hardwired for 8-bit character codes and poorly
structured. I consider this to be entirely the problem of whoever
has implemented such a parser, since the problem exists for Japanese and
Chinese even now.

Some of us, of course, have the advantage of having a parser and layout
engine that were designed with Unicode in mind :), but it's still a lot
easier than juggling multiple encodings and character code widths...

> supporting ASCII-superset encodings is relatively easy
> and in many cases more efficient. UTF8 is an ASCII-superset and would
> fall in the easy to support bucket.

I cannot construct a scenario in my mind in which supporting UTF-8 is
significantly simpler than supporting any other representation of
Unicode. In fact, it takes a little work to imagine situations where
it's even as easy as UCS-2. The only advantage UTF-8 might have is
interoperability with current content.
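
As a rough illustration of the comparison above, here is a minimal sketch
in Python of the two decoding loops. The function names are hypothetical
and are not taken from any actual browser or parser; the UTF-8 routine is
an assumption-laden simplification that handles only the 1-, 2- and 3-byte
forms (the range UCS-2 can represent) and does not reject overlong
sequences.

```python
def decode_ucs2_be(data):
    """Decode big-endian UCS-2: every character code is exactly two bytes."""
    if len(data) % 2:
        raise ValueError("UCS-2 data must have an even length")
    return [(data[i] << 8) | data[i + 1] for i in range(0, len(data), 2)]


def decode_utf8_bmp(data):
    """Decode UTF-8 limited to 1-, 2- and 3-byte forms (U+0000..U+FFFF).

    Simplified for illustration: overlong encodings are not rejected.
    """
    codes = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                # 0xxxxxxx: plain ASCII
            extra, code = 0, lead
        elif (lead & 0xE0) == 0xC0:    # 110xxxxx: one continuation byte
            extra, code = 1, lead & 0x1F
        elif (lead & 0xF0) == 0xE0:    # 1110xxxx: two continuation bytes
            extra, code = 2, lead & 0x0F
        else:
            raise ValueError("unsupported lead byte at offset %d" % i)
        if i + extra >= len(data):
            raise ValueError("truncated sequence at offset %d" % i)
        for b in data[i + 1 : i + 1 + extra]:
            if (b & 0xC0) != 0x80:     # continuation bytes are 10xxxxxx
                raise ValueError("bad continuation byte after offset %d" % i)
            code = (code << 6) | (b & 0x3F)
        codes.append(code)
        i += 1 + extra
    return codes


# Both routines should yield the same character codes for the same text.
sample = "A\u00e9\u65e5"
assert decode_ucs2_be(sample.encode("utf-16-be")) == decode_utf8_bmp(sample.encode("utf-8"))
```

The UCS-2 loop advances by a fixed two bytes, while the UTF-8 loop has to
inspect each lead byte to learn how far to advance and has more ways to
encounter malformed input. Either way, everything past the decoder sees
the same list of character codes.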

Amanda Walker
InterCon Systems Corporation