Perhaps specify UCS-4?  (As opposed to UCS-2.)
>    This RFC does not specify the actual character set or character
>    encoding scheme used in the representation of the document entity
>    or any referenced entity. It is the responsibility of communicating
>    agents to agree upon an actual character set or encoding scheme.
>    The manner in which such an agreement is negotiated is outside the
>    scope of this RFC.
If the RFC is intended to be the spec for the "text/html" Internet
Media Type (as used in MIME and HTTP), then it should say *something*
about charset.
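
A minimal sketch of why this matters to a receiving agent, assuming the
charset travels as a parameter on the Content-Type header (as MIME already
does for text types). The helper name charset_of and the sample header
values are hypothetical and not taken from the RFC draft:

```python
from email.message import Message


def charset_of(content_type, default=None):
    """Return the charset parameter of a Content-Type value, lowercased.

    Falls back to `default` when the header carries no charset, which is
    exactly the case where the receiver has to guess.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


if __name__ == "__main__":
    # Labelled document: the receiver knows how to decode the bytes.
    print(charset_of("text/html; charset=ISO-8859-1"))
    # Unlabelled document: nothing to go on unless the spec names a default.
    print(charset_of("text/html"))
```

Without a stated default or a required parameter, the second case leaves
the decoding choice to each implementation.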
Erik