libperfex - IRIX


LIBPERFEX(3C)							 LIBPERFEX(3C)

NAME

     libperfex,	start_counters,	read_counters, print_counters, print_costs,
     load_costs	- A procedural interface to processor event counters

C SYNOPSIS

	 int start_counters( int e0, int e1 );
	 int read_counters( int	e0, long long *c0, int e1, long	long *c1);
	 int print_counters( int e0, long long c0, int e1, long	long c1);
	 int print_costs( int e0, long long c0,	int e1,	long long c1);
	 int load_costs(char *CostFileName);


FORTRAN	SYNOPSIS
	  INTEGER*8 c0,	c1
	  INTEGER   e0,	e1
          CHARACTER*(n) CostFileName
	  INTEGER*4 function start_counters( e0, e1 )
	  INTEGER*4 function read_counters( e0,	c0, e1,	c1 )
	  INTEGER*4 function print_counters( e0, c0, e1, c1 )
	  INTEGER*4 function print_costs( e0, c0, e1, c1 )
	  INTEGER*4 function load_costs( CostFileName )

DESCRIPTION

     These routines provide simple access to the hardware event	counters.  The
     arguments e0 and e1 are int types specifying which	events to count.  For
     descriptions of the counters themselves, see the perfex(1)	or the
     r10k_counters(5) man page.

     The counts are returned in the long long arguments c0 and c1.  The
     print_counters routine prints the counts to standard error.  Requesting
     two events that must be counted on the same hardware counter causes a
     conflicting counters error.  The arguments e0 and e1 can be overridden by
     setting the environment variables T5_EVENT0 and T5_EVENT1.

     Calls to start_counters implicitly zero the internal software counters
     before starting them, while read_counters implicitly stops the counters
     after reading them.  Thus, to accumulate counts over multiple start/read
     calls, you must save the counts and do the accumulation yourself.  Each
     call incurs typical system call overhead, which can amount to a few
     hundred microseconds per start/stop call pair.  The print_counters
     procedure is just a formatting convenience and has no effect on the state
     of the counters.
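
     Because start_counters zeroes the counts, totals across several measured
     regions have to be kept by the caller.  The following is a minimal
     sketch of that bookkeeping, assuming the prototypes from the C SYNOPSIS.
     The name accumulate_counts, the work callback, and the non-IRIX stand-in
     stubs are hypothetical; the stubs exist only so the fragment compiles on
     other systems, and the use of the __sgi macro to detect IRIX is an
     assumption about the compiler.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __sgi
/* Assumed IRIX build: prototypes as given in the C SYNOPSIS,
 * resolved by linking against libperfex. */
int start_counters(int e0, int e1);
int read_counters(int e0, long long *c0, int e1, long long *c1);
#else
/* Hypothetical stand-ins (NOT the real library) so the sketch builds
 * elsewhere: every measured region reports 100 and 10 events. */
static int stub_gen = 0;
static int start_counters(int e0, int e1)
{
    (void)e0; (void)e1;
    return ++stub_gen;
}
static int read_counters(int e0, long long *c0, int e1, long long *c1)
{
    (void)e0; (void)e1;
    *c0 = 100;
    *c1 = 10;
    return stub_gen;
}
#endif

/* Run work() n times, measuring each run separately and summing the
 * counts into *total0 and *total1.  Returns 0 on success, or -1 if a
 * call fails or the generation numbers disagree (the totals are then
 * not meaningful). */
int accumulate_counts(int e0, int e1, int n, void (*work)(void),
                      long long *total0, long long *total1)
{
    int i;

    assert(total0 != NULL && total1 != NULL);
    *total0 = 0;
    *total1 = 0;
    for (i = 0; i < n; i++) {
        long long c0 = 0, c1 = 0;
        int gen_start, gen_read;

        gen_start = start_counters(e0, e1);          /* zeroes, then starts */
        if (gen_start < 0)
            return -1;
        if (work != NULL)
            work();
        gen_read = read_counters(e0, &c0, e1, &c1);  /* reads, then stops */
        if (gen_read < 0 || gen_read != gen_start)
            return -1;
        *total0 += c0;                               /* caller-side totals */
        *total1 += c1;
    }
    return 0;
}
```

     A caller would pass event numbers chosen from perfex(1) and a function
     wrapping the region of interest.  Each iteration pays the start/stop
     system call overhead mentioned above, so very short regions are better
     measured in larger batches.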

TIME ESTIMATES AND COST	TABLES
     The print_costs routine prints the	counts together	with approximate time
     estimates as described for	perfex.	 The load_costs	procedure allows a
     cost table	to be loaded from a file.  Cost	ranges reflect both the	width
     of	the cost distribution and uncertainty in the degree of overlap.	 Costs
     for different events are not exclusive due	to the overlap of one type of
     event with others.  By default, a table of costs for each event
     appropriate to the host system is used.  If the file /etc/perfex.costs is
     present, it is used instead of the default table.  A table loaded with
     load_costs takes precedence over both.  Partial tables may be used; the
     remaining entries take the appropriate default values.  The format of
     the cost table and how to dump the default cost table for the current
     system are described in the perfex(1) man page.
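
     As an illustration of the calling order, the sketch below optionally
     loads a cost table and then prints the counts with time estimates.  It
     assumes that load_costs and print_costs follow the return convention
     given under DIAGNOSTICS (negative on error).  The name report_costs, the
     file name in the usage note, and the non-IRIX stand-in stubs are
     hypothetical, as is the use of __sgi to detect IRIX.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __sgi
/* Assumed IRIX build: prototypes as given in the C SYNOPSIS. */
int load_costs(char *CostFileName);
int print_costs(int e0, long long c0, int e1, long long c1);
#else
/* Hypothetical stand-ins (NOT the real library): every call succeeds. */
static int load_costs(char *CostFileName)
{
    (void)CostFileName;
    return 1;
}
static int print_costs(int e0, long long c0, int e1, long long c1)
{
    (void)e0; (void)c0; (void)e1; (void)c1;
    return 1;
}
#endif

/* Print counts with time estimates.  If cost_file is non-NULL, load it
 * first so that it replaces the default (or /etc/perfex.costs) table.
 * Returns 0 on success, -1 if either library call reports an error. */
int report_costs(char *cost_file, int e0, long long c0,
                 int e1, long long c1)
{
    assert(c0 >= 0 && c1 >= 0);
    if (cost_file != NULL && load_costs(cost_file) < 0)
        return -1;
    return print_costs(e0, c0, e1, c1) < 0 ? -1 : 0;
}
```

     Passing NULL keeps whichever table is already in effect; passing a path
     such as "my.costs" (a made-up name) loads that table first.  Because the
     cost table is global, the load should happen before any threads are
     started.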

DIAGNOSTICS

     Normal completion returns a positive integer, the generation number.
     Negative return values signal an error.  The underlying operating system
     counter interface may produce other errors.  Among	these are indications
     that the counter resource is in use, possibly because of monitoring by
     another user on the system	(with root privileges) in system mode.	The
     generation	numbers	increment on a per-process basis each time the
     counters are enabled or stopped.  Because the counters may	have been used
     by	a supervisor-privilege process after being enabled by the user with
     start_counters, but before	being read by read_counters, all correct uses
     of	the interface must check that the generation number returned by
     read_counters matches the generation number from the corresponding
     start_counters, for each process.  For example, the following fragment,
     which could be used to collect instruction and data scache miss counts,
     illustrates correct use:

          /* requires <stdio.h> and <stdlib.h> */
          int e0, e1;
          long long c0, c1;
          int gen_start, gen_read;

          /* ... set e0 and e1 ... */

          if ((gen_start = start_counters(e0, e1)) < 0) {
              perror("start_counters");
              exit(1);
          }

          /* user code */

          if ((gen_read = read_counters(e0, &c0, e1, &c1)) < 0) {
              perror("read_counters");
              exit(1);
          }

          if (gen_read != gen_start) {
              /* not an errno failure, so report it without perror */
              fprintf(stderr, "lost counters! aborting...\n");
              exit(1);
          }

          /* do something with c0, c1 */

     Counter measurements are typically not precisely
     reproducible for most events.  For	example, counts	of cache misses	depend
     on	the change in the cache	population due to other	processes when the
     target process is not running.  Invalidation and intervention counts may
     depend on the temporal ordering of	memory accesses	by different
     processors, which is not guaranteed.  Issued instruction counts depend on
     the R10000's dynamic scheduling of	instructions, which in turn can	depend
     on	the pattern of instruction cache misses.  Since	the cache misses are
     affected by the runtime environment as noted above, this count is also
     not precisely reproducible.  In almost all	cases, however,	while the
     exact count may not be reproducible, the amount of	time attributable to
     any fluctuations is a tiny fraction of the execution time.  Conversely,
     any event which accounts for an important fraction of the execution time
     will have a count with small relative fluctuations	from run to run.
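
     One way to check how reproducible a given count is on a particular
     workload is to repeat the measurement and look at the spread.  The
     helper below is a minimal sketch; the name relative_spread and the
     choice of (max - min) / mean as the measure are assumptions made for
     illustration and are not defined by this interface.

```c
#include <assert.h>
#include <stddef.h>

/* Relative spread, (max - min) / mean, of n counts collected from
 * repeated runs of the same code.  Counts are expected to be
 * non-negative.  Returns 0.0 when all counts are zero. */
double relative_spread(const long long *counts, size_t n)
{
    long long min, max;
    double sum = 0.0;
    size_t i;

    assert(counts != NULL && n > 0);
    min = max = counts[0];
    for (i = 0; i < n; i++) {
        if (counts[i] < min)
            min = counts[i];
        if (counts[i] > max)
            max = counts[i];
        sum += (double)counts[i];
    }
    if (sum == 0.0)
        return 0.0;
    return (double)(max - min) / (sum / (double)n);
}
```

     A small value suggests the event count is stable enough to compare
     between runs.  A large value for an event with a low count is expected
     and, as noted above, usually corresponds to a negligible share of the
     execution time.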

RESTRICTIONS

     This interface is not reentrant or (pthread) thread-safe.  The
     start_counters, read_counters, print_counters, and print_costs routines
     can be called by individual sproc threads.  However, there is only
     provision for a single global cost table, so load_costs should be called
     only in a sequential region, and that cost table must be applied to the
     counts from all threads.

FILES

     /usr/lib/libperfex.so
     /usr/lib32/libperfex.so
     /usr/lib64/libperfex.so

DEPENDENCIES

     These procedures are only available on systems with hardware performance
     counters (systems with R10000 or R12000 processors).  Use on systems with
     mixed processor types will have undefined results.

     Manipulating the counters with these routines and simultaneously through
     perfex or the OS counter interface	ioctl procedures can produce an	error
     if, for example, the counters are enabled twice without being released in
     the interim.
