profil(2) profil(2)
profil - execution time profile
int profil (unsigned short *buff, unsigned int bufsiz,
unsigned int offset, unsigned int scale);
profil provides CPU-use statistics by profiling the amount of CPU time
expended by a program. profil generates the statistics by creating an
execution histogram for the current process. The histogram is defined for
a specific region of program code to be profiled, and the identified
region is logically broken up into a set of equal size subdivisions, each
of which corresponds to a count in the histogram. With each clock tick,
the current subdivision is identified and its corresponding histogram
count is incremented. These counts establish a relative measure of how
much time is being spent in each code subdivision. The resulting
histogram counts for a profiled region can be used to identify those
functions that consume a disproportionately high percentage of CPU time.
buff is a buffer of bufsiz bytes in which the histogram counts are stored
in an array of unsigned short int.
offset, scale, and bufsiz specify the region to be profiled.
offset is effectively the start address of the region to be profiled.
scale, broadly speaking, is a contraction factor that indicates how much
smaller the histogram buffer is than the region to be profiled. More
precisely, scale is interpreted as an unsigned 16-bit fixed-point
fraction with the binary point implied on the left. Its value is the
reciprocal of the number of bytes of code mapped to each byte of the
histogram buffer.
Several observations can be made:
- the maximal value of scale, 0x10000, gives a 1-1 mapping of pc's
  to 4-byte words in buff. This value makes no sense for the
  default 16-bit (2-byte) counters, since this means 2 bytes of
  wasted space for every 2 bytes of counter.
- the minimum value of scale (for which profiling is performed),
  0x0002 (1/32,768), maps (by convention) the entire region
  starting with offset to the first counter in buff.
- a scale value of 0x8000 specifies a contraction of two, which
  means every 4-byte instruction maps to 2 bytes of counter space.
  For 16-bit counters this means a 1-1 mapping of instructions to
  counters. For 32-bit counters (see sprofil(2)) this means two
  4-byte instructions map to each 4-byte counter.
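As an illustration of these observations, the following sketch picks a
scale for a given region size and buffer size. It assumes that
scale/0x10000 is the ratio of histogram-buffer bytes to code bytes, which
is consistent with the cases listed above. The helper name
profil_scale_for is hypothetical and is not part of any library.

```c
#include <stdint.h>

/* Hypothetical helper: choose a scale that maps region_bytes of code
   into a histogram buffer of bufsiz bytes.
   Assumes scale / 0x10000 is the ratio of buffer bytes to code bytes. */
unsigned int profil_scale_for(unsigned int region_bytes, unsigned int bufsiz)
{
    uint64_t scale;

    if (region_bytes == 0 || bufsiz == 0)
        return 0;               /* nothing to profile; scale 0 turns profiling off */

    /* Round down so the last byte of the region still lands in the buffer. */
    scale = ((uint64_t)bufsiz << 16) / region_bytes;

    if (scale > 0x10000)
        scale = 0x10000;        /* maximal value */
    if (scale < 2)
        scale = 2;              /* minimum value: whole region in the first counter */
    return (unsigned int)scale;
}
```

For example, a 4096-byte region and a 2048-byte buffer give a scale of
0x8000, that is, one 16-bit counter per 4-byte instruction.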
The values are used within the kernel as follows: when the process is
interrupted for a clock tick (once every 10 milliseconds), the value of
offset is subtracted from the current value of the program counter (pc),
and the difference is multiplied by scale to derive a result. That result
is used as an index into the histogram array to locate the cell to be
incremented. Therefore, the cell count represents the number of times
that the process was executing code in the subdivision associated with
that cell when the process was interrupted.
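The per-tick update described above can be modelled in user code as
follows. This is only a sketch: the exact rounding used by the kernel is
not specified here, so the model assumes that the scaled difference is a
byte offset into buff and that counters are 2 bytes wide. The function
name profil_model_tick is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of one clock tick. Returns 1 if a cell was incremented,
   0 if the pc fell outside the profiled region or profiling is off. */
int profil_model_tick(unsigned short *buff, unsigned int bufsiz,
                      unsigned int offset, unsigned int scale,
                      unsigned int pc)
{
    size_t ncells = bufsiz / sizeof(unsigned short);
    size_t cell;

    if (scale < 2 || pc < offset)
        return 0;               /* scale 0 or 1 turns profiling off */

    if (scale == 2) {
        cell = 0;               /* by convention: entire region, first counter */
    } else {
        /* (pc - offset) * scale, with scale read as a 16-bit fraction */
        uint64_t byte_off = ((uint64_t)(pc - offset) * scale) >> 16;
        cell = (size_t)(byte_off / sizeof(unsigned short));
    }

    if (cell >= ncells)
        return 0;               /* past the end of the histogram */

    buff[cell]++;               /* 16-bit counter; wraps without notice */
    return 1;
}
```

With offset 0x1000 and scale 0x8000, a pc of 0x1004 yields a scaled
offset of 2 bytes, so the second cell is incremented.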
Since each cell is only 16 bits wide, it can overflow. No indication
that this has occurred is given. If an overflow is likely, use the
alternative sprofil(2), which provides a superset of profil(2)
functionality, including optional 32-bit cells.
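To judge whether overflow is likely, the cell width and the tick period
can be turned into a CPU-time budget per cell. The sketch below assumes
exactly one increment per clock tick; the function name
cell_overflow_seconds is hypothetical.

```c
/* CPU seconds that can be charged to a single cell before its counter
   wraps, assuming one increment per clock tick.
   cell_bits must be less than 64. */
double cell_overflow_seconds(unsigned int cell_bits, double tick_ms)
{
    double counts = (double)(1ULL << cell_bits);   /* 65536 for 16-bit cells */
    return counts * tick_ms / 1000.0;
}
```

With 16-bit cells and a 10 millisecond tick this gives 655.36 seconds, or
roughly 11 minutes of CPU time spent in one subdivision. With 32-bit
cells the same calculation gives about 497 days.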
See also sprofil(2), prof(1), times(2), monitor(3X), fork(2), sproc(2).
Profiling is turned off by giving a scale of 0 or 1, and is rendered
ineffective by giving a bufsiz of 0. Profiling is turned off when an
exec(2) is executed, but remains on in both child and parent processes
after a fork(2) or sproc(2). Profiling is turned off if a buff update
would cause a memory fault.
profil always returns 0, indicating success.
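A minimal sketch of turning profiling on and off is shown below. It
rests on several assumptions: that the prototype is visible through
<unistd.h> (true of some common C libraries, possibly only with a
feature-test macro), that code addresses fit the offset argument, and
that passing a null buff together with a scale of 0 is an acceptable way
to stop, which also covers implementations that disable profiling on a
null buffer. The wrapper names are hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* assumption: some libraries need this to declare profil() */
#endif
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical wrapper: start profiling the code region that begins at
   address start, storing 16-bit counts in buff (bufsiz bytes long). */
int start_region_profile(unsigned short *buff, unsigned int bufsiz,
                         uintptr_t start, unsigned int scale)
{
    return profil(buff, bufsiz, (size_t)start, scale);
}

/* Hypothetical wrapper: stop profiling. A scale of 0 turns profiling
   off; the null buffer is an assumption made for portability. */
int stop_region_profile(void)
{
    return profil(NULL, 0, 0, 0);
}
```

A caller would typically pass the address of the first function of
interest as start, run the workload, call stop_region_profile(), and then
examine the counts in buff.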