
UCS-4

ISO 10646 defines a 32-bit encoding form called UCS-4, in which each encoded character in the Universal Character Set is represented by a 32-bit code value in the code space of integers between 0 and hexadecimal 7FFFFFFF.

UCS-4 is sufficient to represent all of Unicode, whose code points extend only up to hexadecimal 10FFFF. Some people consider it wasteful to reserve such a large code space for mapping a relatively small set of code points, so a new encoding form, UTF-32, was proposed. UTF-32 is a subset of UCS-4 that uses 32-bit code values only in the range 0 to hexadecimal 10FFFF.
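
The two ranges described above can be compared with a short Python sketch. It is illustrative only: the function names are hypothetical, the choice of big-endian byte order is an assumption (the text does not specify one), and the range checks deliberately ignore any further restrictions the standards may place on individual values.

```python
import struct

UCS4_MAX = 0x7FFFFFFF   # upper bound of the UCS-4 code space
UTF32_MAX = 0x10FFFF    # upper bound of the UTF-32 code space


def in_ucs4(value: int) -> bool:
    """Return True if value lies in the UCS-4 code space."""
    return 0 <= value <= UCS4_MAX


def in_utf32(value: int) -> bool:
    """Return True if value lies in the narrower UTF-32 range."""
    return 0 <= value <= UTF32_MAX


def encode_code_values_be(text: str) -> bytes:
    """Pack each code point of text as one 32-bit unit.

    Big-endian byte order is an assumption made for this sketch.
    """
    return b"".join(struct.pack(">I", ord(ch)) for ch in text)
```

For instance, in_ucs4(0x110000) is true while in_utf32(0x110000) is false, which reflects UTF-32 being a subset of UCS-4. Every character occupies exactly four bytes, so a string of n characters encodes to 4n bytes.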


wikipedia.org dumped 2003-03-17 with terodump