dechanyu(5)


NAME

       dechanyu - A character encoding system (codeset) for Traditional
       Chinese

DESCRIPTION

       The DEC Hanyu (dechanyu) codeset consists of the following
       sets of characters:

       o  ASCII
       o  The first and second character planes of CNS 11643-1986
       o  Digital Taiwan Supplemental Character Set (DTSCS)
       o  User-defined characters

       DEC Hanyu uses a combination of single-byte data, 2-byte
       data, and 4-byte data to represent ASCII characters,
       symbols, or ideographic characters.

   ASCII Characters
       All ASCII characters are represented as single-byte, 7-bit
       data in DEC Hanyu; that is, the most significant bit (MSB)
       of a byte that represents an ASCII character is always off
       (0). Refer to ascii(5) for more information about the ASCII
       character set.

   CNS 11643-1986 Characters (Planes 1 and 2)
       Each  plane of the CNS 11643-1986 character set is divided
       into 94 rows and each of these rows has  94  columns.  The
       characters   defined  in  plane  1  and  plane  2  of  CNS
       11643-1986 are as follows:

       ------------------------------------------------------------------------
       Character Plane   Character Type                    Number of Characters
       ------------------------------------------------------------------------
       1                 Special characters                651
                         Control characters                33
                         Frequently used characters        5401
       2                 Less frequently used characters   7650
       ------------------------------------------------------------------------

       Note that the first two planes of the CNS 11643-1986
       character set are the same as those specified for the
       revised CNS 11643-1992 character set.

       In DEC Hanyu, each CNS 11643-1986 character is represented
       by two bytes, in conformance with the CNS 11643-1986
       standard. The MSB of the first byte is always on, while
       that of the second byte is on for the first character
       plane and off for the second character plane.
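
       The MSB rule above is enough to tell the two planes apart.
       The following Python fragment is a minimal sketch of that
       test; the function name cns_plane is hypothetical and is not
       part of any system library.

```python
def cns_plane(b1: int, b2: int) -> int:
    """Return the CNS 11643-1986 plane (1 or 2) of a 2-byte DEC Hanyu value.

    b1 and b2 are the first and second byte as integers (0-255).
    """
    if not b1 & 0x80:
        # A first byte with the MSB off is a single-byte ASCII character.
        raise ValueError("first byte of a 2-byte character must have its MSB on")
    # MSB of the second byte: on selects plane 1, off selects plane 2.
    return 1 if b2 & 0x80 else 2


print(cns_plane(0xC4, 0xA1))  # plane 1
print(cns_plane(0xC4, 0x21))  # plane 2
```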

       The first byte of a CNS 11643-1986 encoding determines the
       row number of the character, while the second byte
       determines its column number. Code ranges for the two
       character planes are as follows:

              Plane 1: A1A1 to FEFE
              Plane 2: A121 to FE7E

       The following formulas determine the value of a CNS
       11643-1986 character in relation to its row and column
       numbers.

       For a CNS 11643-1986 Plane 1 character:

              1st byte = A0(hex) + Row number
              2nd byte = A0(hex) + Column number

       For a CNS 11643-1986 Plane 2 character:

              1st byte = A0(hex) + Row number
              2nd byte = 20(hex) + Column number

       For example, if a character is positioned at the first
       column of the 36th row on CNS 11643 plane 1, its value is
       C4A1, which is calculated as follows (row and column
       numbers are decimal; 36 decimal is 24 hex):

              1st byte = A0(hex) + 36 = C4(hex)
              2nd byte = A0(hex) + 01 = A1(hex)

       Similarly, if a character is positioned at the first
       column of the 36th row on CNS 11643 plane 2, its value is
       C421, which is calculated as follows:

              1st byte = A0(hex) + 36 = C4(hex)
              2nd byte = 20(hex) + 01 = 21(hex)
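
       The formulas can be checked with a short Python sketch. The
       function names encode_cns and decode_cns are hypothetical
       helpers written for this illustration; rows and columns are
       assumed to be numbered 1 to 94, as described above.

```python
from typing import Tuple


def encode_cns(plane: int, row: int, col: int) -> bytes:
    """Build the 2-byte DEC Hanyu value for a CNS 11643-1986 character."""
    if plane not in (1, 2):
        raise ValueError("plane must be 1 or 2")
    if not (1 <= row <= 94 and 1 <= col <= 94):
        raise ValueError("row and column must be in the range 1 to 94")
    second_base = 0xA0 if plane == 1 else 0x20
    return bytes([0xA0 + row, second_base + col])


def decode_cns(pair: bytes) -> Tuple[int, int, int]:
    """Return (plane, row, column) for a 2-byte DEC Hanyu value."""
    b1, b2 = pair
    plane = 1 if b2 & 0x80 else 2
    second_base = 0xA0 if plane == 1 else 0x20
    return plane, b1 - 0xA0, b2 - second_base


print(encode_cns(1, 36, 1).hex().upper())  # C4A1
print(encode_cns(2, 36, 1).hex().upper())  # C421
print(decode_cns(bytes.fromhex("C421")))   # (2, 36, 1)
```

       Encoding row 1, column 1 and row 94, column 94 reproduces
       the end points of the code ranges A1A1 to FEFE (plane 1)
       and A121 to FE7E (plane 2).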


   DTSCS Characters
       Currently, only the EDPC (Electronic Data Processing
       Centre) Recommended Character Set, which defines a total of
       6319 characters (rows 1 to 68), is included in the Digital
       Taiwan Supplementary Character Set (DTSCS). In the revised
       CNS 11643-1992 standard, the 6319 characters in the EDPC
       Recommended Character Set are assigned to the third and
       fourth character planes as follows:

       ---------------------------------------------------------
       EDPC Characters   Character Plane   Number of Characters
       ---------------------------------------------------------
       Part I            Plane 3           6148
       Part II           Plane 4           171
       ---------------------------------------------------------

       The characters defined in Plane  3  and  Plane  4  of  CNS
       11643-1992 are as follows:

       --------------------------------------------------------------------------
       Character Plane   Character Type                           Number of
                                                                  Characters
       --------------------------------------------------------------------------
       3                 Rarely-used characters (EDPC Part I)     6148
       4                 Used for residency system, ISO 2nd       7298
                         edition DIS 10646 Han characters, 171
                         EDPC Part II Characters
       --------------------------------------------------------------------------

       In DEC Hanyu, each DTSCS character  is  represented  by  a
       4-byte  value.  The first two bytes are the leading value,
       specifically C2CB, which is used as a designator  sequence
       for  the  DTSCS  character  set.  The MSB of the third and
       fourth bytes is set on for the EDPC Recommended  Character
       Set.
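
       Taken together, the ASCII, CNS 11643-1986, and DTSCS rules
       suggest how a byte stream might be split into characters.
       The following Python fragment is a sketch only: split_units
       is a hypothetical name, and the code assumes that C2CB at a
       character boundary always introduces a 4-byte DTSCS
       character. It does not validate that trailing bytes fall in
       a legal range.

```python
DTSCS_LEAD = b"\xC2\xCB"  # designator sequence for DTSCS characters


def split_units(data: bytes) -> list:
    """Split DEC Hanyu data into 1-, 2-, and 4-byte character units."""
    units = []
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            size = 1                      # MSB off: single-byte ASCII
        elif data[i:i + 2] == DTSCS_LEAD:
            size = 4                      # C2CB plus two more bytes
        else:
            size = 2                      # CNS 11643-1986 plane 1 or 2
        unit = data[i:i + size]
        if len(unit) != size:
            raise ValueError(f"truncated sequence at offset {i}")
        units.append(unit)
        i += size
    return units


sample = b"A\xC4\xA1\xC2\xCB\xA1\xA1\xC4\x21B"
print([u.hex().upper() for u in split_units(sample)])
# ['41', 'C4A1', 'C2CBA1A1', 'C421', '42']
```

       Note that the second byte of a plane 2 character (21 hex in
       the sample) has its MSB off, so a scanner must consume it as
       part of the 2-byte unit rather than treat it as ASCII.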

   User-Defined Characters
       In addition to the two Chinese character sets described in
       preceding sections, DEC Hanyu provides  an  area  of  3587
       positions for user-defined characters (UDC). The positions
       for UDC are those  positions  that  are  unused  (but  not
       reserved)  code  points  on the first and second character
       planes of CNS 11643-1986.

       The encoding for UDC is  exactly  the  same  as  that  for
       CNS11643-1986  except  that  the  two  sets  of characters
       occupy different regions.  Code  ranges  for  UDC  are  as
       follows:

       -----------------------------------------------
       Character Plane   Number of UDC   Code Range
       -----------------------------------------------
       1                 145             FDCC to FEFE
       1                 2256            AAA1 to C1FE
       2                 1186            F245 to FE7E
       -----------------------------------------------
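
       The counts in the table can be reproduced under one
       assumption that this page does not state explicitly: each
       code range is read in row-major order, and only second
       bytes in the valid column range (A1 to FE for plane 1, 21
       to 7E for plane 2) count as positions. The Python sketch
       below uses hypothetical names (UDC_RANGES, count_positions,
       is_udc) to illustrate this reading.

```python
# (plane, first code, last code), copied from the table above.
UDC_RANGES = [
    (1, 0xFDCC, 0xFEFE),
    (1, 0xAAA1, 0xC1FE),
    (2, 0xF245, 0xFE7E),
]


def column_bounds(plane: int) -> tuple:
    """Lowest and highest valid second byte for the given plane."""
    low = 0xA1 if plane == 1 else 0x21
    return low, low + 93  # 94 columns per row


def count_positions(plane: int, start: int, end: int) -> int:
    """Count code points in start..end whose second byte is a valid column."""
    low, high = column_bounds(plane)
    return sum(1 for code in range(start, end + 1) if low <= (code & 0xFF) <= high)


def is_udc(pair: bytes) -> bool:
    """Return True if a 2-byte value falls in one of the UDC ranges."""
    code = int.from_bytes(pair, "big")
    for plane, start, end in UDC_RANGES:
        low, high = column_bounds(plane)
        if start <= code <= end and low <= pair[1] <= high:
            return True
    return False


counts = [count_positions(*r) for r in UDC_RANGES]
print(counts, sum(counts))  # [145, 2256, 1186] 3587
```

       Under this reading the three ranges yield 145, 2256, and
       1186 positions, which matches the table and the total of
       3587 given above.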


   Codeset Conversion
       The following codeset converter pairs are available for
       converting Traditional Chinese characters between dechanyu
       and other encoding formats. Refer to iconv_intro(5) for
       an introduction to codeset conversion. For more information
       about the other codeset for which dechanyu is the
       input or output, see the reference page specified in the
       list item.

       big5_dechanyu, dechanyu_big5
              Converting from and to the Big-5 codeset: big5(5).

              Note that Big-5 encoding is equivalent to the
              Microsoft code-page format used on PCs for
              Traditional Chinese. See code_page(5) for information
              about PC code pages.

       dechanzi_dechanyu, dechanyu_dechanzi
              Converting from and to the DEC Hanzi codeset:
              dechanzi(5).

       eucTW_dechanyu, dechanyu_eucTW
              Converting from and to Taiwanese Extended UNIX
              Code: eucTW(5).

       telecode_dechanyu, dechanyu_telecode
              Converting from and to the Telecode codeset:
              telecode(5).

       UTF-16_dechanyu, dechanyu_UTF-16
              Converting from and to UTF-16 format: Unicode(5).

       UCS-4_dechanyu, dechanyu_UCS-4
              Converting from and to UCS-4 format: Unicode(5).

       UTF-8_dechanyu, dechanyu_UTF-8
              Converting from and to UTF-8 format: Unicode(5).

   Fonts for DEC Hanyu Characters
       The operating system  provides  both  screen  and  printer
       fonts for DEC Hanyu characters.

       The following DECwindows Motif fonts are grouped according
       to character set and family; they  reflect  various  sizes
       and typefaces for 75dpi and 100dpi display devices:

       CNS 11643-1986 fonts (Hei family):

       -adecw-hei-medium-r-normal--16-160-75-75-m-160-dec.cns11643.1986-2
       -adecw-hei-medium-r-normal--24-240-75-75-m-240-dec.cns11643.1986-2
       -adecw-hei-medium-r-normal--16-160-100-100-m-160-dec.cns11643.1986-2
       -adecw-hei-medium-r-normal--24-240-100-100-m-240-dec.cns11643.1986-2


       CNS 11643-1986 fonts (Screen family):

       -adecw-screen-medium-r-normal--18-180-75-75-m-160-dec.cns11643.1986-2
       -adecw-screen-medium-r-normal--24-240-75-75-m-240-dec.cns11643.1986-2
       -adecw-screen-medium-r-normal--18-180-100-100-m-160-dec.cns11643.1986-2
       -adecw-screen-medium-r-normal--24-240-100-100-m-240-dec.cns11643.1986-2
       -adecw-screen-medium-r-normal--18-180-100-100-m-160-dec.cns11643.1986-UDC
       -adecw-screen-medium-r-normal--24-240-100-100-m-240-dec.cns11643.1986-UDC

       CNS 11643-1986 fonts (Sung family):

       -adecw-sung-medium-r-normal--24-240-75-75-m-240-dec.cns11643.1986-2
       -adecw-sung-medium-r-normal--32-320-75-75-m-320-dec.cns11643.1986-2
       -adecw-sung-medium-r-normal--24-240-100-100-m-240-dec.cns11643.1986-2
       -adecw-sung-medium-r-normal--32-320-100-100-m-320-dec.cns11643.1986-2


       DTSCS fonts (Hei family):

       -adecw-hei-medium-r-normal--16-160-75-75-m-160-dec.dtscs.1990-2
       -adecw-hei-medium-r-normal--24-240-75-75-m-240-dec.dtscs.1990-2
       -adecw-hei-medium-r-normal--16-160-100-100-m-160-dec.dtscs.1990-2
       -adecw-hei-medium-r-normal--24-240-100-100-m-240-dec.dtscs.1990-2


       DTSCS fonts (Screen family):

       -adecw-screen-medium-r-normal--18-180-75-75-m-160-dec.dtscs.1990-2
       -adecw-screen-medium-r-normal--24-240-75-75-m-240-dec.dtscs.1990-2
       -adecw-screen-medium-r-normal--18-180-100-100-m-160-dec.dtscs.1990-2
       -adecw-screen-medium-r-normal--24-240-100-100-m-240-dec.dtscs.1990-2


       DTSCS fonts (Sung family):

       -adecw-sung-medium-r-normal--24-240-75-75-m-240-dec.dtscs.1990-2
       -adecw-sung-medium-r-normal--32-320-75-75-m-320-dec.dtscs.1990-2
       -adecw-sung-medium-r-normal--24-240-100-100-m-240-dec.dtscs.1990-2
       -adecw-sung-medium-r-normal--32-320-100-100-m-320-dec.dtscs.1990-2


       The operating system provides the following PostScript
       printer fonts for CNS 11643-1986 characters:

              Hei-Light-CNS11643
              Sung-Light-CNS11643

       These PostScript fonts support only the Traditional
       Chinese characters in planes 1 and 2 of the CNS 11643
       character set. The Traditional Chinese characters in the
       DTSCS character set are not supported by printer fonts. The
       restriction also applies to the eucTW codeset, which also
       includes DTSCS characters and is supported by the same
       fonts as dechanyu.

       For general information on printing Asian  language  text,
       refer to i18n_printing(5).





SEE ALSO

       Commands: locale(1)

       Others: ascii(5), big5(5), Chinese(5), code_page(5),
       dechanzi(5), eucTW(5), GBK(5), i18n_intro(5),
       i18n_printing(5), iconv_intro(5), l10n_intro(5), sbig5(5),
       telecode(5)



                                                      dechanyu(5)