Perhaps specify UCS-4? (As opposed to UCS-2.)
> This RFC does not specify the actual character set or character
> encoding scheme used in the representation of the document entity
> or any referenced entity. It is the responsibility of communicating
> agents to agree upon an actual character set or encoding scheme.
> The manner in which such an agreement is negotiated is outside the
> scope of this RFC.
If the RFC is intended to be the spec for the "text/html" Internet
Media Type (as used in MIME and HTTP), then it should say *something*
about charset.
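
To illustrate what "saying something about charset" could mean in practice, here is a minimal sketch of how a receiving agent might read a charset parameter from a Content-Type value. It assumes the parameter is spelled "charset", as in MIME; the helper name declared_charset and the sample header values are hypothetical, and nothing here is taken from the RFC draft itself.

```python
from email.message import Message


def declared_charset(content_type, default=None):
    """Return the charset parameter of a Content-Type value, lowercased.

    Returns `default` when no charset parameter is present. The function
    name and the fallback behaviour are illustrative assumptions.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


if __name__ == "__main__":
    # Sample header values, made up for illustration.
    print(declared_charset("text/html; charset=ISO-8859-1"))
    print(declared_charset("text/html"))
```

If the parameter is absent, the receiver has nothing to go on unless the spec names a default, which is the gap the quoted paragraph leaves open.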
Erik