colltbl(1M)							   colltbl(1M)

NAME

     colltbl - create collation	database

SYNOPSIS

     colltbl [ file | -	]

DESCRIPTION

     The colltbl command takes as input	a specification	file, file, that
     describes the collating sequence for a particular language	and creates a
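     database that can be read by strxfrm(3C) and strcoll(3C).  strxfrm(3C)
     transforms a string so that transformed strings can be correctly
     ordered with strcmp(3C), strncmp(3C), or memcmp(3C).  strcoll(3C)
     transforms its arguments and does a comparison.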

     If	no input file is supplied, stdin is read.
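
     The two ways of consuming a collation database can be sketched from
     Python, whose locale module wraps the C library routines.  This is a
     minimal illustration, not part of colltbl; it selects the portable "C"
     locale (an assumption made so the sketch runs anywhere) rather than a
     locale built with colltbl.

```python
import functools
import locale

# Assumption: use the portable "C" locale. A locale whose LC_COLLATE
# database was produced by colltbl would be selected the same way.
locale.setlocale(locale.LC_COLLATE, "C")

words = ["relocate", "Lever", "lever", "levitate"]

# strxfrm route: transform each string once, then compare the results
# with ordinary string comparison.
by_xfrm = sorted(words, key=locale.strxfrm)

# strcoll route: compare pairs of strings directly under the locale.
by_coll = sorted(words, key=functools.cmp_to_key(locale.strcoll))
```

     Both routes should give the same ordering; transforming once with
     strxfrm tends to pay off when the same strings are compared many times.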

     The output	file produced contains the database with collating sequence
     information in a form usable by system commands and routines.  The	name
     of	this output file is the	value you assign to the	keyword	codeset	read
     in	from file.  Before this	file can be used, it must be installed in the
     /usr/lib/locale/locale directory with the name LC_COLLATE by someone who
     is	super-user or a	member of group	bin.  locale corresponds to the
     language area whose collation sequence is described in file.  This	file
     must be readable by user, group, and other; no other permissions should
     be	set.  To use the collating sequence information	in this	file, set the
     LC_COLLATE environment variable appropriately [see environ(5) or
     setlocale(3C)].
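
     The permission requirement above (readable by user, group, and other,
     with nothing else set) corresponds to mode 0444.  The following is a
     small sketch of checking it on a POSIX system; the helper name and the
     temporary stand-in file are hypothetical, and no real locale directory
     is touched.

```python
import os
import stat
import tempfile

def mode_is_read_only_for_all(path):
    """True if path is r--r--r-- (0444): readable by all, nothing else set."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o444

# Hypothetical stand-in for an installed LC_COLLATE file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name

os.chmod(demo_path, 0o644)              # writable by owner: not acceptable
before = mode_is_read_only_for_all(demo_path)

os.chmod(demo_path, 0o444)              # the mode the text asks for
after = mode_is_read_only_for_all(demo_path)

os.remove(demo_path)
```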

     The colltbl command can support languages whose collating sequence	can be
     completely	described by the following cases:

     o	 Ordering of single characters within the code set.  For example, in
	 Swedish, V is sorted after U, before X, and with W (V and W are
	 considered identical as far as	sorting	is concerned).

     o	 Ordering of ``double characters'' in the collation sequence.  For
	 example, in Spanish, ch and ll are collated after c and l,
	 respectively.

     o	 Ordering of a single character	as if it consists of two characters.
	 For example, in German, the ``sharp s,'' ß, is sorted as ss.  This is
	 a special instance of the next	case below.

     o	 Substitution of one character string with another character string.
	 In the example above, the string ß is replaced with ss during
	 sorting.

     o	 Ignoring certain characters in	the code set during collation.	For
	 example, if - were ignored during collation, then the strings
	 re-locate and relocate	would compare as equal.

     o	 Secondary ordering between characters.	 In the	case where two
	 characters are	sorted together	in the collation sequence, (that is,
	 they have the same "primary" ordering), there is sometimes a
	 secondary ordering that is used if two	strings	are identical except
	 for characters	that have the same primary ordering.  For example, in
	 French, the letters e and è have the same primary ordering but e
	 comes before è in the secondary ordering.  Thus the word lever would
	 be ordered before lèver, but lèver would be sorted before levitate.
	 (Note that if e came before è in the primary ordering, then lèver
	 would be sorted after levitate.)
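
     Several of the cases above (substitution, ignored characters, and
     primary versus secondary ordering) can be modelled as a two-level sort
     key.  The sketch below is an illustration only and says nothing about
     how colltbl stores its database; the weight tables and function names
     are hypothetical, and double characters are not handled here.

```python
import string

def collation_key(text, primary, secondary, substitutions=(), ignored=""):
    """Two-level key: all primary weights first, then all secondary weights."""
    for old, new in substitutions:               # e.g. replace "ß" with "ss"
        text = text.replace(old, new)
    chars = [c for c in text if c not in ignored]    # drop ignored characters
    return ([primary[c] for c in chars],
            [secondary.get(c, 0) for c in chars])

# Hypothetical tables: a..z in order, with "è" sharing the primary weight
# of "e" but following it in the secondary ordering.
PRIMARY = {c: i for i, c in enumerate(string.ascii_lowercase)}
PRIMARY["è"] = PRIMARY["e"]
SECONDARY = {"è": 1}

def demo_key(text):
    return collation_key(text, PRIMARY, SECONDARY,
                         substitutions=[("ß", "ss")], ignored="-")

french = sorted(["levitate", "lèver", "lever"], key=demo_key)
```

     Because the secondary weights are compared only when every primary
     weight ties, lèver stays ahead of levitate even though è follows e.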

     The specification file consists of	three types of statements:

     1.	 codeset   filename

	 filename is the name of the output file to be created by colltbl.

     2.	 order is  order_list

	 order_list is a list of symbols, separated by semicolons, that
	 defines the collating sequence.  The special symbol, ..., specifies
	 symbols that are lexically sequential in a short-hand form.  For
	 example,

	      order is	a;b;c;d;...;x;y;z

	 would specify the list	of lowercase letters.  Of course, this could
	 be further compressed to just a;...;z.
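
	 The ... short-hand can be pictured as filling in every code point
	 between its two neighbours.  A minimal sketch, assuming single
	 character symbols on both sides of the ellipsis (the function name
	 is hypothetical and this is not colltbl's own parser):

```python
import string

def expand_order_list(order_list):
    """Expand '...' between two single-character symbols into the full run."""
    symbols = order_list.split(";")
    expanded = []
    for i, sym in enumerate(symbols):
        if sym != "...":
            expanded.append(sym)
            continue
        if i == 0 or i == len(symbols) - 1:
            raise ValueError("'...' needs a symbol on each side")
        start, end = symbols[i - 1], symbols[i + 1]
        # The neighbours themselves are emitted by the normal branch.
        expanded.extend(chr(c) for c in range(ord(start) + 1, ord(end)))
    return expanded

long_form = expand_order_list("a;b;c;d;...;x;y;z")
short_form = expand_order_list("a;...;z")
```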

	 A symbol can be up to two bytes in length and can be represented in
	 any one of the	following ways:

	 o   the symbol	itself (for example, a for the lowercase letter	a),

	 o   in	octal representation (for example, \141	or 0141	for the	letter
	     a), or

	 o   in	hexadecimal representation (for	example, \x61 or 0x61 for the
	     letter a).

	 Any combination of these may be used as well.
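
	 The three notations can be decoded to the same character.  The
	 sketch below is one plausible reading of the rules above; how
	 colltbl itself tells an octal token such as 0141 apart from a
	 literal symbol is an assumption here, and the function name is
	 hypothetical.

```python
import re

def decode_symbol(token):
    """Decode a literal, octal (\\141, 0141) or hex (\\x61, 0x61) symbol."""
    hex_match = re.fullmatch(r"(?:\\x|0x)([0-9A-Fa-f]+)", token)
    if hex_match:
        return chr(int(hex_match.group(1), 16))
    # Assumption: a backslash or leading 0 followed by octal digits is octal;
    # a lone "0" has no digits after it and so stays a literal.
    oct_match = re.fullmatch(r"(?:\\|0)([0-7]+)", token)
    if oct_match:
        return chr(int(oct_match.group(1), 8))
    return token

decoded = [decode_symbol(t) for t in ("a", r"\141", "0141", r"\x61", "0x61")]
```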

	 The backslash character, \ , is used for continuation.	 No characters
	 are permitted after the backslash character.

	 Symbols enclosed in parentheses are assigned the same primary
	 ordering but different	secondary ordering.  Symbols enclosed in curly
	 brackets are assigned only the	same primary ordering.	For example,

	      order is	a;b;c;ch;d;(e;è);f;...;z;\
			{1;...;9};A;...;Z

	 In the above example, e and è are assigned the same primary ordering
	 and different secondary ordering, digits 1 through 9 are assigned the
	 same primary ordering and no secondary	ordering.  Only	primary
	 ordering is assigned to the remaining symbols.	 Notice	how double
	 letters can be	specified in the collating sequence (letter ch comes
	 between c and d).

	 If a character	is not included	in the order is	statement, it is
	 excluded from the ordering and	will be	ignored	during sorting.
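
	 The grouping rules for an order list can be sketched as a mapping
	 from each symbol to a (primary, secondary) pair.  This is only an
	 illustration under simplifying assumptions: the ... short-hand is
	 not expanded, and the function name and the weight representation
	 are hypothetical rather than colltbl's database format.

```python
import re

def assign_weights(order_list):
    """Map each symbol to (primary, secondary) following the grouping rules."""
    weights = {}
    # Split on ';' but keep "( ... )" and "{ ... }" groups together.
    groups = re.findall(r"\([^)]*\)|\{[^}]*\}|[^;]+", order_list)
    for primary, group in enumerate(groups):
        if group.startswith("("):
            # Same primary weight, distinct secondary weights.
            for secondary, sym in enumerate(group[1:-1].split(";")):
                weights[sym] = (primary, secondary)
        elif group.startswith("{"):
            # Same primary weight, no secondary distinction.
            for sym in group[1:-1].split(";"):
                weights[sym] = (primary, 0)
        else:
            weights[group] = (primary, 0)
    return weights

weights = assign_weights("a;b;c;ch;d;(e;è);f;{1;2;3}")
```

	 A symbol that never appears in the list gets no entry in the
	 mapping, which corresponds to its being ignored during sorting.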

     3.	 substitute string with	repl

	 The substitute	statement substitutes the string string	with the
	 string	repl.  This can	be used, for example, to provide rules to sort
	 the abbreviated month names numerically:

	      substitute "Jan" with "01"
	      substitute "Feb" with "02"
	      substitute "Dec" with "12"

	 A simpler use of the substitute statement would be to substitute a
	 single	character with two characters, as with the substitution	of ß
	 with ss in German.
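
	 The effect of the month-name rules can be sketched by parsing the
	 statements and applying them before comparison.  The regular
	 expression assumes the double-quoted form used in the examples
	 above; the function names are hypothetical and the sketch does not
	 reproduce colltbl's actual parser.

```python
import re

SUBSTITUTE_RE = re.compile(
    r'^\s*substitute\s+"([^"]*)"\s+with\s+"([^"]*)"\s*$')

def parse_substitute(line):
    """Return (string, repl) from a 'substitute "x" with "y"' statement."""
    match = SUBSTITUTE_RE.match(line)
    if match is None:
        raise ValueError(f"not a substitute statement: {line!r}")
    return match.group(1), match.group(2)

def apply_substitutions(text, rules):
    """Apply each rule in turn (a simplification: rules are not isolated)."""
    for old, new in rules:
        text = text.replace(old, new)
    return text

RULES = [parse_substitute(s) for s in (
    'substitute "Jan" with "01"',
    'substitute "Feb" with "02"',
    'substitute "Dec" with "12"',
)]

months = sorted(["Feb", "Dec", "Jan"],
                key=lambda m: apply_substitutions(m, RULES))
```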

     The substitute statement is optional.  The	order is and codeset
     statements	must appear in the specification file.

     Any lines in the specification file with a	# in the first column are
     treated as	comments and are ignored.  Empty lines are also	ignored.
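
     The line-level rules (comments, empty lines, and backslash
     continuation) can be sketched as a small pre-processing pass that
     yields one logical statement per entry.  The function name is
     hypothetical, and trimming leading white space on continuation lines
     is an assumption of this sketch.

```python
def logical_lines(spec_text):
    """Drop comments and empty lines, and join backslash continuations."""
    lines = []
    pending = ""
    for raw in spec_text.splitlines():
        # A '#' in the first column marks a comment; empty lines are skipped.
        if not pending and (raw.startswith("#") or not raw.strip()):
            continue
        piece = raw.strip()
        if piece.endswith("\\"):
            pending += piece[:-1]       # statement continues on the next line
            continue
        lines.append(pending + piece)
        pending = ""
    return lines

SAMPLE = "\n".join([
    "# telephone book ordering",
    "codeset telephone",
    "",
    "order is A;a;\\",
    "         B;b",
    'substitute "0" with "zero"',
])

statements = logical_lines(SAMPLE)
```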

EXAMPLE

     The following example shows the collation specification required to
     support a hypothetical telephone book sorting sequence.

     The sorting sequence is defined by	the following rules:

     a.  Upper- and lowercase letters must be sorted together, but uppercase
         letters have precedence over lowercase letters.

     b.  All special characters and punctuation should be ignored.

     c.  Digits must be sorted as their alphabetic counterparts (for example,
         0 as zero, 1 as one).

     d.  The Ch, ch, CH combinations must be collated between C and D.

     e.  V and W, v and w must be collated together.

     The input specification file to colltbl will contain:

	       codeset	 telephone

	       order is	 A;a;B;b;C;c;CH;Ch;ch;D;d;E;e;F;f;\
			 G;g;H;h;I;i;J;j;K;k;L;l;M;m;N;n;O;o;P;p;\
			 Q;q;R;r;S;s;T;t;U;u;{V;W};{v;w};X;x;Y;y;Z;z

	       substitute "0" with "zero"
	       substitute "1" with "one"
	       substitute "2" with "two"
	       substitute "3" with "three"
	       substitute "4" with "four"
	       substitute "5" with "five"
	       substitute "6" with "six"
	       substitute "7" with "seven"
	       substitute "8" with "eight"
	       substitute "9" with "nine"
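
     The behaviour that rules a through e aim for can be approximated with
     a hand-built key function.  This is a sketch under stated assumptions,
     not the output of colltbl: the weight table is constructed directly
     from the rules, two-character combinations are matched before single
     characters, and all names are hypothetical.

```python
import string

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

def build_weights():
    """Weights for rules a, d and e: A < a < B ..., CH/Ch/ch after c, V=W."""
    weights = {}
    rank = 0
    for upper in string.ascii_uppercase:
        lower = upper.lower()
        if upper == "W":                    # rule e: W sorts with V, w with v
            weights["W"] = weights["V"]
            weights["w"] = weights["v"]
            continue
        weights[upper] = rank               # rule a: uppercase first
        weights[lower] = rank + 1
        rank += 2
        if upper == "C":                    # rule d: combinations before D
            for combo in ("CH", "Ch", "ch"):
                weights[combo] = rank
                rank += 1
    return weights

WEIGHTS = build_weights()

def telephone_key(name):
    for digit, word in enumerate(DIGIT_WORDS):      # rule c
        name = name.replace(str(digit), word)
    key = []
    i = 0
    while i < len(name):
        pair = name[i:i + 2]
        if pair in WEIGHTS:                 # two-character element first
            key.append(WEIGHTS[pair])
            i += 2
        elif name[i] in WEIGHTS:
            key.append(WEIGHTS[name[i]])
            i += 1
        else:                               # rule b: everything else ignored
            i += 1
    return key

names = sorted(["Dahl", "Chavez", "Cruz"], key=telephone_key)
```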

FILES

     /usr/lib/locale/locale/LC_COLLATE
		     LC_COLLATE	database for locale

		     input file	used to	construct LC_COLLATE in	the default
		     locale

SEE ALSO

     memory(3C), setlocale(3C),	strcoll(3C), string(3C), strxfrm(3C),
     environ(5)
