I don't understand the conjecture "characters are variable, while
bytes are static."
Perhaps you mean, e.g., that the byte-length of the UTF-8
encoding of a string doesn't vary linearly with the number of characters
in the string. That doesn't make it any less precise to specify
lengths in characters.
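
To make the distinction concrete, here is a minimal sketch in Python. The helper name utf8_lengths and the sample strings are made up for illustration; the point is only that the character count is well defined regardless of how many bytes the UTF-8 encoding happens to take.

```python
# Hypothetical helper (not from any spec): report the length of a string
# in characters and the length of its UTF-8 encoding in bytes.
def utf8_lengths(s: str) -> tuple[int, int]:
    return len(s), len(s.encode("utf-8"))

# Sample strings are assumptions chosen for illustration.
samples = [
    "abc",                 # ASCII: 1 byte per character
    "n\u00e9",             # 'n' plus e-acute: 1 byte + 2 bytes
    "\u65e5\u672c\u8a9e",  # three CJK characters: 3 bytes each
]

for s in samples:
    chars, octets = utf8_lengths(s)
    print(f"{chars} characters, {octets} bytes")
# 3 characters, 3 bytes
# 2 characters, 3 bytes
# 3 characters, 9 bytes
```

The byte count depends on which characters appear, but a limit stated in characters is just as precise as one stated in bytes.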
This is an interesting issue, and the spec doesn't really make it
clear: HTML is two abstractions at once. It is an SGML application, defined
in terms of characters, and a MIME content type, defined in terms of
bytes.
The link is the assumed/missing/controversial "charset" parameter
which specifies how you take a MIME body of type text/html, that is, a
sequence of bytes, and translate it into an SGML entity, that is, a
sequence of characters.
In HTML 2.0, the charset parameter is (implicitly) "iso-latin-1" which
has a well-defined meaning in both the MIME and SGML camps.
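
The role of the charset parameter can be sketched roughly as follows. This is an illustration only: body_to_characters is a hypothetical helper, and it assumes that "iso-latin-1" corresponds to the codec Python calls "iso-8859-1". It is not how any particular user agent is implemented.

```python
# Hypothetical helper: turn a MIME body (a sequence of bytes) into the
# sequence of characters that the SGML layer sees. The default charset
# is an assumption mirroring the implicit HTML 2.0 default.
def body_to_characters(body: bytes, charset: str = "iso-8859-1") -> str:
    return body.decode(charset)

# The same character sequence can arrive as different byte sequences...
latin1_body = b"caf\xe9"      # e-acute is one byte in ISO 8859-1
utf8_body = b"caf\xc3\xa9"    # e-acute is two bytes in UTF-8

assert body_to_characters(latin1_body) == "caf\u00e9"
assert body_to_characters(utf8_body, "utf-8") == "caf\u00e9"

# ...and the same bytes read under the wrong charset give different
# characters: here 5 characters instead of 4.
misread = body_to_characters(utf8_body)  # decoded as ISO 8859-1
assert misread == "caf\u00c3\u00a9"
assert len(misread) == 5
```

Without an agreed charset, the byte sequence alone does not determine the characters, which is why the parameter is the link between the two views.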
The "HTML and MIME" and/or "HTML and SGML" sections should make this
clear, I suppose.
If I had my druthers, though, we should cite the MIME and SGML specs as
normative references, provide the DTD and the MIME Content-Type
registration info, and be done with it. These terms are defined quite
nicely in the respective documents. It's really painful to reproduce the
SGML specification and the MIME specification in this HTML document.
Call me a minimalist.
Dan