That's up to the client. I would probably make it a user preference.
>> This RFC specifies a document character set which is used in the
>> interpretation of characters in the document entity and in the
>> entities referenced from the document entity. This document
>> character set is ISO/IEC 10646-1:1993.
>
>Perhaps specify UCS-4? (As opposed to UCS-2.)
Um. ISO 10646 is a character set, not an encoding; UCS-2 and UCS-4 are
encodings of it...
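
To make the distinction concrete, here is a minimal sketch in Python. The codec names utf-16-be and utf-32-be are my stand-ins for UCS-2 and UCS-4 (they coincide only for characters in the Basic Multilingual Plane), and the sample character is just an illustration, not anything taken from the draft.

```python
# The document character set assigns a number (code point) to each character.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, chosen only as an example
code_point = ord(ch)  # the same number regardless of how the bytes are stored

# An encoding decides how that number is serialized into bytes.
# Assumption: utf-16-be and utf-32-be stand in for UCS-2 and UCS-4 here.
ucs2_bytes = ch.encode("utf-16-be")  # two octets per BMP character
ucs4_bytes = ch.encode("utf-32-be")  # four octets per character

print(hex(code_point), ucs2_bytes.hex(), ucs4_bytes.hex())

# Both byte sequences decode back to the same character.
assert ucs2_bytes.decode("utf-16-be") == ucs4_bytes.decode("utf-32-be") == ch
```

The character number stays the same in both cases; only the byte serialization differs, so the document character set can be fixed independently of whichever encoding is used on the wire.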