    >    This RFC specifies a document character set which is used in the
    >    interpretation of characters in the document entity and in the
    >    entities referenced from the document entity.  This document
    >    character set is ISO/IEC 10646-1:1993.
    Perhaps specify UCS-4?  (As opposed to UCS-2.)
The encoding form is irrelevant.  By definition, specifying 10646 as
the document character set provides potential access to all coded characters
in the standard.
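As a rough illustration of the point (a sketch only, not part of any spec),
the same coded characters can be serialized in several encoding forms and
still decode to identical code points. The Python codecs "utf-16-be" and
"utf-32-be" are used here as stand-ins for UCS-2 and UCS-4, which is only
valid for characters in the Basic Multilingual Plane; the sample string and
the helper name code_points are made up for this example.

```python
# Sketch: one sequence of coded characters, three encoding forms.
# Assumption: utf-16-be / utf-32-be stand in for UCS-2 / UCS-4 (BMP only).
text = "caf\u00e9 \u03a9"  # U+00E9 (e with acute), U+03A9 (Greek capital omega)

def code_points(data, encoding):
    """Decode bytes with the given encoding and return the code-point numbers."""
    return [ord(ch) for ch in data.decode(encoding)]

encodings = ["utf-8", "utf-16-be", "utf-32-be"]

# The byte representations differ in length and content ...
byte_lengths = {enc: len(text.encode(enc)) for enc in encodings}

# ... but the coded characters recovered from them are the same.
results = {enc: code_points(text.encode(enc), enc) for enc in encodings}

for enc in encodings:
    print(enc, byte_lengths[enc], [hex(cp) for cp in results[enc]])
```

The byte counts differ per encoding, while the code-point lists agree, which
is why the choice of UCS-2 versus UCS-4 has no bearing on the document
character set.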
    >    This RFC does not specify the actual character set or character
    >    encoding scheme used in the representation of the document entity
    >    or any referenced entity. It is the responsibility of communicating
    >    agents to agree upon an actual character set or encoding scheme.
    >    The manner in which such an agreement is negotiated is outside the
    >    scope of this RFC.
    If the RFC is intended to be the spec for the "text/html" Internet
    Media Type (as used in MIME and HTTP), then it should say *something*
    about charset.
The HTML RFC should not say anything for the reason that Larry pointed out.
The encoding form is related only to the representation of an entity;
therefore, it is unrelated to the document character set.
The right place to specify this is in the HTTP RFC and/or along with other
transport specs.
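
To make the division of labor concrete, here is a minimal sketch of a
receiver that takes the encoding from a transport-level label rather than
from the HTML spec. The header value and body bytes are invented for the
example, and the fallback to ISO-8859-1 when no charset parameter is present
is an assumption of this sketch, not something the HTML RFC states.

```python
from email.message import Message

# Hypothetical transport metadata and entity body, for illustration only.
content_type = "text/html; charset=ISO-8859-1"
body = b"<p>caf\xe9</p>"  # 0xE9 is e-acute in ISO-8859-1

# Use the stdlib header parser to pull out the charset parameter.
msg = Message()
msg["Content-Type"] = content_type
charset = msg.get_content_charset()  # normalized to lower case

# Assumed fallback for this sketch when the label is absent.
if charset is None:
    charset = "iso-8859-1"

# Decoding maps the bytes onto coded characters; what those characters mean
# in the document is then a matter for the document character set.
document_text = body.decode(charset)
print(charset, [hex(ord(ch)) for ch in document_text])
```

Nothing in this step depends on the HTML spec; the label travels with the
transport, and the result is a sequence of characters to be interpreted
against 10646.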
Glenn