wsregexp(3W)							  wsregexp(3W)

NAME

     wsregexp:	wsrecompile, wsrestep, wsrematch, wsreerr - Wide character
     based regular expression compile and match	routines

SYNOPSIS

     #include <wsregexp.h>
     #include <widec.h>

     long wsrecompile(struct rexdata *prex, long *expbuf,
			 long *endbuf, wchar_t eof);
     int  wsrestep(struct rexdata *prex, wchar_t *wstr,	long *expbuf);

     int  wsrematch(struct rexdata *prex, wchar_t *wstr, long *expbuf);
     char *wsreerr(int err);

DESCRIPTION

     These functions are general purpose internationalized regular expression
     matching routines to be used in programs that perform regular expression
     matching.	These functions	are defined by the wsregexp.h header file.

     The function wsrecompile takes as input an	internationalized regular
     expression	as defined below (apart	from the normal	regular	expressions as
     defined by	regexp)	and produces a compiled	expression that	can be used
     with wsrestep or wsrematch.

          struct rexdata {
               short     sed;      /* flag for sed */
               wchar_t   *str;     /* regular expression */
               int       err;      /* returned error code, 0 = no error */
               wchar_t   *loc1;
               wchar_t   *loc2;
               int       circf;
          };

     The first parameter, prex,	is a pointer to	the specification of the
     regular expression. prex->sed should be non-zero if sed style delimiter
     syntax is to be adopted. prex->str	should point to	the regular expression
     that needs	to be compiled.	The regular expression string should be	in
     wide character format. prex->err indicates any error during the
     compilation and use of this regular expression.  expbuf points to the
     place where the compiled regular expression will be placed. endbuf	points
     to	the first long after the space where the compiled regular expression
     may be placed.  (endbuf-expbuf) should be large enough for	the compiled
     regular expression	to fit.	 eof is	the wide character which marks the end
     of	the regular expression.	 This character	is usually a / (slash).
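
     Since prex->str must be in wide character format, a pattern held as an
     ordinary multibyte string has to be converted first. The following is a
     minimal sketch of that step using only the standard mbstowcs function;
     the helper name to_wide_pattern is hypothetical and is not part of
     wsregexp.h.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a multibyte pattern into the wide character
 * form that prex->str is expected to point to.
 * Returns the number of wide characters written (excluding the null),
 * or (size_t)-1 if the input is invalid or does not fit in dst. */
size_t to_wide_pattern(const char *src, wchar_t *dst, size_t cap)
{
    assert(src != NULL && dst != NULL && cap > 0);
    size_t n = mbstowcs(dst, src, cap);
    if (n == (size_t)-1 || n >= cap)
        return (size_t)-1;      /* invalid sequence, or no room for the null */
    return n;                   /* dst is null terminated at this point */
}
```

     With a buffer declared as long expbuf[N], the matching endbuf argument
     would be &expbuf[N], the first long past the usable space.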

     If	wsrecompile was	successful, it returns the pointer to the end of the
     regular expression, endbuf. Otherwise, 0 is returned and the error	code
     is	set in prex->err.

     The functions wsrestep and	wsrematch do pattern matching given a null
     terminated	wide character string wstr and a compiled regular expression
     expbuf as input. expbuf for these functions should be the compiled
     regular expression which was obtained by a call to the function
     wsrecompile.

     The function wsrestep returns non-zero if some substring of wstr matches
     the regular expression in expbuf and zero if there	is no match.  The
     function wsrematch	returns	non-zero if a substring	of wstr	starting from
     the beginning matches the regular expression in expbuf and	zero if	there
     is	no match.  If there is a match,	prex->loc1 and prex->loc2 are set.
     prex->loc1	points to the first wide character that	matched	the regular
     expression; prex->loc2 points to the wide character after the last	wide
     character that matches the	regular	expression.  Thus if the regular
     expression	matches	the entire input string, prex->loc1 will point to the
     first wide	character of wstr and prex->loc2 will point to the null	at the
     end of wstr.
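
     Because prex->loc2 points one past the last matching wide character,
     the match length is simply loc2 - loc1. The sketch below copies the
     matched text out of the input string; copy_match is a hypothetical
     helper, and its loc1 and loc2 parameters stand in for prex->loc1 and
     prex->loc2 after a successful wsrestep or wsrematch call.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Copy the matched text [loc1, loc2) into out, which holds cap wide chars.
 * Returns the match length, or (size_t)-1 if out is too small. */
size_t copy_match(const wchar_t *loc1, const wchar_t *loc2,
                  wchar_t *out, size_t cap)
{
    assert(loc1 != NULL && loc2 != NULL && out != NULL && loc1 <= loc2);
    size_t len = (size_t)(loc2 - loc1);   /* loc2 is one past the match */
    if (len + 1 > cap)
        return (size_t)-1;                /* no room for text plus null */
    wmemcpy(out, loc1, len);
    out[len] = L'\0';
    return len;
}
```

     When the whole input matches, loc1 is the start of wstr and loc2 is its
     terminating null, so the returned length equals wcslen(wstr).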

     wsrestep uses the variable	circf of struct	rexdata	which is set by
     wsrecompile if the	regular	expression begins with ^ (caret). If this is
     set then wsrestep will try	to match the regular expression	to the
     beginning of the string only. If more than	one regular expression is to
     be	compiled before	the first is executed, the value of prex->circf	should
     be	saved for each compiled	expression and should be set to	that saved
     value before each call to wsrestep.
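
     One way to follow this advice is to keep each compiled expression
     together with its saved circf value. The sketch below is illustrative
     only: struct rexdata_like is a stand-in for the real struct rexdata from
     wsregexp.h, and struct compiled_re, save_circf and restore_circf are
     hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the one field of struct rexdata used here (assumption:
 * the real definition comes from <wsregexp.h>). */
struct rexdata_like {
    int circf;
};

/* Hypothetical wrapper pairing a compiled expression with its circf. */
struct compiled_re {
    long *expbuf;   /* buffer holding the compiled expression */
    int   circf;    /* value of prex->circf right after compilation */
};

/* Call right after compiling into expbuf. */
void save_circf(struct compiled_re *re, long *expbuf,
                const struct rexdata_like *prex)
{
    assert(re != NULL && prex != NULL);
    re->expbuf = expbuf;
    re->circf = prex->circf;
}

/* Call right before matching against re->expbuf. */
void restore_circf(const struct compiled_re *re, struct rexdata_like *prex)
{
    assert(re != NULL && prex != NULL);
    prex->circf = re->circf;
}
```

     In a real program, save_circf would follow each wsrecompile call and
     restore_circf would precede each wsrestep call on that expression.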

     wsreerr returns the error message corresponding to	the error code in the
     language of the current locale. The error code err	should be one returned
     by	the wsregexp functions in the err variable of struct rexdata.

     The internationalized regular expressions available for use with the
     wsregexp functions	are constructed	as follows:

     Expression	 Meaning

     c		 the character c where c is not	a special character.

     [[:class:]] class is any character type as defined by the LC_CTYPE
                 locale category. class can be one of the following:

		 alpha	 a letter

		 upper	 an upper-case letter

		 lower	 a lower-case letter

		 digit	 a decimal digit

		 xdigit	 a hexadecimal digit

		 alnum	 an alphanumeric character

		 space	 any whitespace	character

		 punct	 a punctuation character

		 print	 a printable character

		 graph	 a character that has a	visible	representation

		 cntrl	 a control character

     [[=c=]]     An equivalence class, that is, any collation element defined
                 as having the same relative order in the current collation
                 sequence as c.  As an example, if A and a belong to the same
                 equivalence class, then both [[=A=]b] and [[=a=]b] are
                 equivalent to [Aab].

     [[.cc.]]    This represents a multi-character collating symbol.
                 Multi-character collating elements must be represented as
                 collating symbols to distinguish them from single-character
                 collating elements. As an example, if the string ab is a
                 valid collating element, then [[.ab.]] will be treated as an
                 element and will match the same string of characters, while
                 ab will match the list of characters a and b. If the
                 multi-character collating symbol is not a valid collating
                 element in the current collating sequence definition, the
                 symbol will be treated as an invalid expression.

     [[c-c]]     Any collation element in the character expression range c-c,
                 where c can identify a collating symbol or an equivalence
                 class.  If the character - (hyphen) appears immediately after
                 an opening square bracket, e.g. [-c], or immediately prior to
                 a closing square bracket, e.g. [c-], it has no special
                 meaning.

     Immediately following an opening square bracket ^ means the complement
     of, e.g. [^c]. Otherwise, it has no special meaning.

     Within square brackets, a . that is not part of a [[.cc.]]	 sequence, or
     a : that is not part of a [[:class:]] sequence, matches itself.
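
     The bracket rules above follow the same conventions as POSIX bracket
     expressions, so they can be tried out on systems without wsregexp.h by
     using the standard regcomp and regexec functions from regex.h. The
     sketch below does that; note that it is a different API working on
     ordinary char strings, and it is an assumption that wsregexp treats
     these particular patterns identically. posix_matches is a hypothetical
     helper name.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Returns 1 if some substring of text matches pattern (a POSIX basic
 * regular expression), 0 if nothing matches, -1 if pattern is invalid. */
int posix_matches(const char *pattern, const char *text)
{
    regex_t re;
    assert(pattern != NULL && text != NULL);
    if (regcomp(&re, pattern, REG_NOSUB) != 0)
        return -1;                       /* pattern failed to compile */
    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;                      /* regexec returns 0 on a match */
}
```

     For instance, posix_matches("[-c]", "-") and posix_matches("[c-]", "-")
     both report a match because the hyphen is literal in those positions,
     while posix_matches("[^c]", "c") reports none.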

SEE ALSO


DIAGNOSTICS

     Errors are:

	  ERR_NORMBR		   no remembered search	string

          ERR_REOVFLOW             regexp overflow
                                   This happens when wsrecompile cannot fit
                                   the compiled regular expression in
                                   (endbuf-expbuf).

	  ERR_BRA		   ( ) imbalance

	  ERR_DELIM		   illegal or missing delimiter.

	  ERR_NBR		   bad number in { }

	  ERR_2MNBR		   more	than 2 numbers given in	{ }

	  ERR_DIGIT		   digit out of	range

	  ERR_2MLBRA		   too many (

	  ERR_RANGE		   range number	too large

	  ERR_MISSB		   } expected after \

	  ERR_BADRNG		   first number	exceeds	second in { }.

	  ERR_SIMBAL		   [ ] imbalance.

	  ERR_SYNTAX		   illegal regular expression

          ERR_ILLCLASS             illegal [:class:]

          ERR_EQUIL                illegal [=class=]

          ERR_COLL                 illegal [.cc.]

EXAMPLE

     The following is an example of how	the regular expression macros and
     calls might be defined by an application program:

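          #include <stdio.h>         /* fprintf, BUFSIZ */
          #include <stdlib.h>        /* mbstowcs, mbtowc */
          #include <string.h>        /* strlen */
          #include <wsregexp.h>
          #include <widec.h>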

	   . . .
	  struct rexdata rex;
	  long expbuf [BUFSIZ];	     /*	Buffer for the compiled	RE */

	  /* Define a RE to identify a capitalized word	*/
	  char *regexp = "[[:space:]][[:upper:]]";
	  wchar_t wregexp [512];
	  wchar_t weof;		     /*	The end	of regular expression */
	  char eof = '\0';

	  wchar_t linebuf [BUFSIZ];  /*	Buffer for the input string */
	   . . .
	  (void) mbstowcs(wregexp, regexp, strlen(regexp)+1);

	  (void) mbtowc(&weof, &eof, 1);
	  rex.str = wregexp;
	  rex.sed = 0;
	  rex.err = 0;
	  if (!wsrecompile(&rex, expbuf, &expbuf[BUFSIZ], weof))
		  fprintf(stderr, "%s\n", wsreerr(rex.err));
	   . . .
	  if (wsrestep(&rex, linebuf, expbuf))
