GBK, gbk - A character encoding system (codeset) for Simplified
Chinese
The GBK character set is an extension to the GB 2312-80
character set. (The "K" in "GBK" is the first sound in the
Chinese word "Kuo Zhan," which means "extension.") GBK
includes all the Hanzi characters specified by the ISO
10646-1:1993 standard (characters also known as the GB
13000.1-93 character set) that are not included in GB
2312-80. GBK is therefore defined as a normative annex of
GB 13000.1-93.
GBK Value Ranges and Code Points
The GBK codeset is divided into five levels, as follows:
-----------------------------------------------------------
Level    Encoding Range    Code Points    Characters
-----------------------------------------------------------
GBK/1    0xA1A1-0xA9FE     846            717
GBK/2    0xB0A1-0xF7FE     6,768          6,763
GBK/3    0x8140-0xA0FE     6,080          6,080
GBK/4    0xAA40-0xFEA0     8,160          8,160
GBK/5    0xA840-0xA9A0     192            166
-----------------------------------------------------------
In addition, GBK includes code points for user-defined
characters, as follows:
-----------------------------
Encoding Range    Code Points
-----------------------------
0xAAA1-0xAFFE     564
0xF8A1-0xFEFE     658
0xA140-0xA7A0     672
-----------------------------
GBK therefore provides a total of 23,940 code points,
21,886 of which are assigned.
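The totals above can be checked with a short calculation. The following
Python sketch is illustrative only and is not part of the operating
system; the function name count_code_points and the range labels are
made up for this example. It treats each encoding range as a rectangle
of first bytes and second bytes, skips 0x7F (never a valid second byte),
and takes the GBK/4 range as 0xAA40-0xFEA0, which is the range that
yields the 8,160 code points listed for that level.

```python
def count_code_points(start, end):
    """Count two-byte code points in the rectangle start..end.

    The first byte runs from the high byte of start to the high byte of
    end; the second byte runs from the low byte of start to the low byte
    of end, skipping 0x7F, which is never a valid second byte.
    """
    first_lo, second_lo = start >> 8, start & 0xFF
    first_hi, second_hi = end >> 8, end & 0xFF
    rows = first_hi - first_lo + 1
    cols = sum(1 for b in range(second_lo, second_hi + 1) if b != 0x7F)
    return rows * cols


# Labels are hypothetical; the ranges come from the two tables above.
RANGES = {
    "GBK/1": (0xA1A1, 0xA9FE),
    "GBK/2": (0xB0A1, 0xF7FE),
    "GBK/3": (0x8140, 0xA0FE),
    "GBK/4": (0xAA40, 0xFEA0),  # assumed end point, see lead-in
    "GBK/5": (0xA840, 0xA9A0),
    "user-defined 1": (0xAAA1, 0xAFFE),
    "user-defined 2": (0xF8A1, 0xFEFE),
    "user-defined 3": (0xA140, 0xA7A0),
}

counts = {name: count_code_points(*r) for name, r in RANGES.items()}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name:15s} {n:6,d}")
print(f"{'total':15s} {total:6,d}")
```

The per-range counts are the products of the number of first bytes and
the number of second bytes in each range, and their sum is the 23,940
code points stated above.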
Each row in the GBK code table consists of 190 characters.
ASCII characters, which are single-byte characters, are
defined in the range 0x21-0x7E. Encoding ranges for two-byte
characters are as follows:
Encoding range for the first byte: 0x81-0xFE
Encoding ranges for the second byte: 0x40-0x7E and
0x80-0xFE
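The byte ranges above are enough to split a GBK byte stream into
characters. The following Python sketch is a minimal illustration, not
a system interface; the names is_first_byte, is_second_byte, and
split_gbk are hypothetical. It assumes that any byte outside the
first-byte range 0x81-0xFE stands alone as a single-byte unit.

```python
def is_first_byte(b):
    """True if b can start a two-byte GBK character."""
    return 0x81 <= b <= 0xFE


def is_second_byte(b):
    """True if b can end a two-byte GBK character (0x7F is excluded)."""
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFE


def split_gbk(data):
    """Split a bytes object into one-byte and two-byte units."""
    units = []
    i = 0
    while i < len(data):
        if is_first_byte(data[i]):
            # A first byte must be followed by a valid second byte.
            if i + 1 >= len(data) or not is_second_byte(data[i + 1]):
                raise ValueError(f"invalid two-byte sequence at offset {i}")
            units.append(data[i:i + 2])
            i += 2
        else:
            # Assumption: every other byte is a single-byte unit.
            units.append(data[i:i + 1])
            i += 1
    return units


# Number of valid second bytes, which is the length of one code table row.
row_length = sum(1 for b in range(256) if is_second_byte(b))

print(split_gbk(b"A\xd6\xd0B"))
print(row_length)
```

Counting the valid second bytes gives 63 + 127 = 190, which matches the
row length stated above.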
Note
In terms of character-to-code allocation, the sub-range
for GB 2312-80 characters (0xA1A1-0xFEFE) in GBK is the
same encoding range defined for these characters in
Extended UNIX Code (EUC). GBK is therefore backward compatible
with Chinese EUC encoding as well as forward compatible
with the encoding as defined by ISO 10646-1:1993.
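The backward compatibility described in the note can be observed with
any GBK implementation. The following Python sketch uses Python's
built-in "gb2312" and "gbk" codecs as stand-ins; they are not the
system's own codeset tables, so treat this as an illustration under that
assumption. The helper name in_gb2312_subrange is hypothetical.

```python
# U+4E2D U+6587, two Hanzi that are both present in GB 2312-80.
text = "\u4e2d\u6587"
euc_bytes = text.encode("gb2312")  # Python's codec for the EUC form of GB 2312-80
gbk_bytes = text.encode("gbk")


def in_gb2312_subrange(pair):
    """True if both bytes of a two-byte character lie in 0xA1-0xFE."""
    return len(pair) == 2 and all(0xA1 <= b <= 0xFE for b in pair)


# GB 2312-80 characters get the same bytes in both encodings.
same_bytes = euc_bytes == gbk_bytes
pairs = [gbk_bytes[i:i + 2] for i in range(0, len(gbk_bytes), 2)]
all_in_subrange = all(in_gb2312_subrange(p) for p in pairs)

# U+9F8D, a traditional-form Hanzi that GB 2312-80 does not include.
extension_char = "\u9f8d"
extension_bytes = extension_char.encode("gbk")
try:
    extension_char.encode("gb2312")
    encodable_in_gb2312 = True
except UnicodeEncodeError:
    encodable_in_gb2312 = False

print(same_bytes, all_in_subrange)
print(encodable_in_gb2312, in_gb2312_subrange(extension_bytes))
```

Characters that GBK adds to GB 2312-80 are encoded with at least one
byte below 0xA1, so they never collide with the 0xA1A1-0xFEFE sub-range.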
GBK is the standard character set and encoding used in the
Simplified Chinese version of Windows 95.
Codeset Converters for GBK
The following codeset converter pairs are available for
converting Simplified Chinese characters between GBK and
UCS formats. Refer to Unicode(5) for more information
about the UTF-16, UCS-4, and UTF-8 encoding formats.
Refer to iconv_intro(5) for an introduction to codeset
conversion.
UTF-16_GBK, GBK_UTF-16
    Converting from and to UTF-16 format
UCS-4_GBK, GBK_UCS-4
    Converting from and to UCS-4 format
UTF-8_GBK, GBK_UTF-8
    Converting from and to UTF-8 format
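To show what conversion between GBK and the UCS formats produces, the
following Python sketch converts a short GBK byte string to each format
and back. It uses Python's built-in codecs rather than the converter
pairs listed above, so it only illustrates the mapping; the function
names are hypothetical, "utf-32-be" is used as an approximation of
UCS-4, and big-endian byte order without a byte-order mark is an
assumption made for this example.

```python
def convert_from_gbk(data, target):
    """Decode GBK bytes and re-encode them in the target encoding."""
    return data.decode("gbk").encode(target)


def convert_to_gbk(data, source):
    """Decode bytes in the source encoding and re-encode them as GBK."""
    return data.decode(source).encode("gbk")


gbk_bytes = b"\xd6\xd0\xce\xc4"  # U+4E2D U+6587 in GBK

utf8 = convert_from_gbk(gbk_bytes, "utf-8")
utf16 = convert_from_gbk(gbk_bytes, "utf-16-be")  # assumed byte order
ucs4 = convert_from_gbk(gbk_bytes, "utf-32-be")   # stand-in for UCS-4

for name, converted in (("UTF-8", utf8), ("UTF-16", utf16), ("UCS-4", ucs4)):
    print(f"{name:7s} {converted.hex(' ')}")

# Each conversion is reversible for characters that GBK can represent.
round_trip_ok = (
    convert_to_gbk(utf8, "utf-8") == gbk_bytes
    and convert_to_gbk(utf16, "utf-16-be") == gbk_bytes
    and convert_to_gbk(ucs4, "utf-32-be") == gbk_bytes
)
print(round_trip_ok)
```

On the system itself, the equivalent conversions are performed through
the converter pairs listed above; see iconv_intro(5).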
Fonts for GBK
The following Simplified Chinese TrueType fonts are
installed as the operating system default fonts for GBK:
-css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-gbk-1
-css_dongwen-fangsong-medium-r-normal--0-0-0-0-c-0-iso8859-1
-css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-gbk-1
-css_dongwen-heiti-medium-r-normal--0-0-0-0-c-0-iso8859-1
-css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-gbk-1
-css_dongwen-kaiti-medium-r-normal--0-0-0-0-c-0-iso8859-1
-css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-gbk-1
-css_dongwen-songti-medium-r-normal--0-0-0-0-c-0-iso8859-1
The following Simplified Chinese TrueType fonts are
available as an installation option:
-huatian-fangsong-medium-r-normal--0-0-0-0-c-0-gbk-1
-huatian-fangsong-medium-r-normal--0-0-0-0-m-0-iso8859-1
-huatian-heiti-medium-r-normal--0-0-0-0-c-0-gbk-1
-huatian-heiti-medium-r-normal--0-0-0-0-m-0-iso8859-1
-huatian-kaiti-medium-r-normal--0-0-0-0-c-0-gbk-1
-huatian-kaiti-medium-r-normal--0-0-0-0-m-0-iso8859-1
-huatian-songti-medium-r-normal--0-0-0-0-c-0-gbk-1
-huatian-songti-medium-r-normal--0-0-0-0-m-0-iso8859-1
The default and optional fonts can be used for printing
only with Chinese text printers. With either the default
or optional font sets installed, the SongTi fonts are the
default screen fonts for the GBK codeset. See wwpsof(8)
for information on the PostScript print filter and TrueType
fonts.
Commands: locale(1)
Others: ascii(5), big5(5), Chinese(5), dechanyu(5),
dechanzi(5), eucTW(5), GB18030(5), i18n_intro(5),
i18n_printing(5), l10n_intro(5), sbig5(5), telecode(5)
GBK(5)