lockinfo - Displays kernel locking statistics for SMP and
NUMA platforms
/usr/sbin/lockinfo [-since_boot] [-class=lock_class]
[-top=count] [-sort=statistic] [-rad=rad_id] [-percpu]
[command [cmd_args]]
-class=lock_class
     Specifies a particular lock class name (for example,
     thread.lock) for which trace information is to be
     displayed. In addition to the general information
     displayed about all lock classes, this option displays
     detailed information about where in the kernel code the
     specified lock class is asserted.

     For an alternative way to request trace information,
     see the entry for the -top option.

-percpu
     Displays statistics for each lock on a per-CPU basis.

-rad=rad_id
     Displays lock statistics for the specified RAD.

     This option is useful only on NUMA systems. The
     lockinfo command prints an error message if the
     specified RAD does not exist.

     The -rad option is available in the version of the
     lockinfo software included in the Tru64 UNIX product
     kit, starting with Version 5.1B. However, this
     version of lockinfo can also be used on Versions
     5.1 and 5.1A.

-since_boot
     Displays locking statistics for the system since it
     was booted. To use this option, the system must have
     been booted with the lockmode attribute set to 4. See
     sys_attrs_generic(5) for more information about the
     lockmode attribute.

-sort=statistic
     Sorts output statistics from highest to lowest for
     the specified statistic. The value for statistic
     can be one of the following, which also correspond
     to the column headings in the lockinfo display:

     tries     The number of tries for asserting the lock.
               (Default)

     reads     The number of attempts on read.

     trmax     The maximum number of readers in the critical
               path (for a C class lock) or the maximum
               number of cycles spent holding the lock (for
               an S, RWS, or MCS class lock).

     misses    The number of lock misses.

     sleeps    The number of blocks encountered while
               waiting for the lock.

     waitmax   The maximum amount of time (in seconds) spent
               waiting for a lock.

     waitsum   The total (in seconds) of all times spent
               waiting for the lock.

               The lock miss percent (corresponds to the
               percent misses column in the display).

     You can also specify none for statistic. In this
     case, lockinfo sorts output in the order in which
     the lock classes are defined in the kernel source.
     (This is not a particularly useful option.)

-top=count
     Causes lockinfo to do an initial pass of statistics
     gathering and then perform lock class tracing for
     each, in turn, of the asserted locks that sort in the
     top count from the first pass. When you specify the
     -top option, any command you pass to lockinfo will
     execute count+1 times.

     To request trace information for one specific lock
     class, see the entry for the -class option.
command
     The command to be executed by lockinfo.

cmd_args
     Any arguments to the preceding command.

The command and cmd_args operands are used to limit the
length of time that lockinfo runs. Typically, sleep is
specified for command and some number of seconds is specified
for cmd_args.
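
Because the -top option causes the specified command to execute
count+1 times, the length of a -top run grows with count. The
following is a minimal sketch of that arithmetic; the function
name top_runtime is hypothetical, and the estimate assumes that
sleep is the command and ignores the time lockinfo itself needs
to collect and print data.

```shell
# Estimate the seconds spent in "sleep" for a run such as:
#     lockinfo -top=COUNT sleep SECS
# The -top description states that the command executes count+1 times
# (one statistics pass plus one trace pass per top lock class).
top_runtime() {
    count=$1
    secs=$2
    echo $(( (count + 1) * secs ))
}
```

For example, top_runtime 5 60 prints 360, so a -top=5 run with
sleep 60 keeps statistics gathering active for at least six minutes.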
The lockinfo utility collects and displays locking statistics
for the kernel SMP locks. It uses the /dev/lockdev
pseudo driver to collect data. Locking statistics can be
gathered when the lockmode attribute for the generic subsystem
is set to 2 (the default for SMP systems), 3, or 4.
Be sure to review RESTRICTIONS for important information
about support for this utility.
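
The lockmode requirement can be checked before lockinfo is run.
The following sketch assumes that the current value is queried
with "sysconfig -q generic lockmode" and that the output contains
a line of the form "lockmode = N"; both the command and the output
layout are assumptions here and are not taken from this reference
page. The function names are hypothetical.

```shell
# Read sysconfig-style output on stdin and print the lockmode value.
# Assumed input layout (not confirmed by this reference page):
#     generic:
#     lockmode = 2
parse_lockmode() {
    awk '$1 == "lockmode" && $2 == "=" { print $3; exit }'
}

# Succeed only if the value allows statistics gathering (2, 3, or 4).
lockmode_ok() {
    case $1 in
        2|3|4) return 0 ;;
        *)     return 1 ;;
    esac
}
```

A possible use is to store the result of
"sysconfig -q generic lockmode | parse_lockmode" in a variable and
pass that variable to lockmode_ok. Note that the -since_boot option
has the stricter requirement of lockmode 4.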
When you enter a lockinfo command, the utility first opens
the lockdev pseudo driver and turns on lock statistics
gathering. Then, the utility forks and executes the specified
command. After command completes, the utility turns
off lock statistics gathering (closes lockdev), collects
the data, and sends it to stdout. The output data shows
the SMP locking that was done by the operating system during
the execution time for command.
If you do not include command to set a time limit on the
interval for gathering statistics, the statistics shown
are for the length of time since the last system boot.
To gather statistics with lockinfo, you typically follow
these steps:

1.   Start up a system work load and wait for it to get to a
     steady state.

2.   Start lockinfo with sleep as the specified command and
     some number of seconds as the specified cmd_args. This
     causes lockinfo to gather statistics for the length of
     time it takes the sleep command to execute.

3.   Based on the first set of results, use lockinfo again to
     request more specific information about any lock class
     that shows results, such as a large percentage of misses,
     that you believe is very likely to cause a system
     performance problem.

For example, the following command causes lockinfo to collect
locking statistics for 60 seconds:

     # lockinfo sleep 60

The output from this command might look like this:

     hostname:    sysname.node.corp.com
     lockmode:    2 (SMP default)
     processors:  4
     start time:  Wed Jun 9 14:38:05 1999
     end time:    Wed Jun 9 14:39:05 1999
     command:     sleep 60

              tries   reads   trmax  misses  percent  sleeps  waitmax  waitsum
                                              misses          seconds  seconds
     bsBuf.bufLock (S)
            5718642       0   45745  194509      3.4       0  0.00007  0.63226
     lock.l_lock (S)
            5579643       0   40985   75656      1.4       0  0.00005  0.16531
     thread.lock (S)
            1989132       0   24817   21795      1.1       0  0.00003  0.03864
     vnode.v_lock (S)
            1578583       0   49207    1527      0.1       0  0.00002  0.00443
     vdT.vdIoLock (S)
            1412449       0  149797   81078      5.7       0  0.00017  0.51438
     BsBufHashLock (S)
            1377903       0   42586   89312      6.5       0  0.00006  0.24563
       . . .
     inifaddr_lock (C)
                  1       1       1       0      0.0       0  0.00000  0.00000

     total simple_locks = 28545191    percent unknown = 0.0
     total rws_locks = 1429           percent reads = 100.0
     total complex_locks = 2764296    percent reads = 33.2
                                      percent unknown = 0.0
     #

The first six lines of output specify the system, its
lockmode attribute setting, how many processors it has,
the start and end times of the statistics gathering, and
the command that was run.
Following the initial six lines, a table with statistics
data for each lock class is displayed. See the description
of the -sort option for an explanation of the data in
each column. In this example, the lock class entries in
the table are sorted by the number of tries (the default
sort order).
Note that each lock class name is tagged (in parentheses)
with its lock type. The following lock types are supported:
---------------------------------------
Tag Lock Type
---------------------------------------
(C) Complex lock
(MCS) Queued lock (NUMA system only)
(RWS) Reader/Writer spin lock
(S) Simple spin lock
---------------------------------------
Finally, the display ends with some summary information:
the total number of different types of locks and the read
percentages for each type.
When diagnosing a system problem, certain statistics are
more important than others to look at. The following
results most likely indicate a problem:

o    A large number of tries

o    A percent misses value that is too high.

     "Too high" varies somewhat, depending on the lock
     class. A kernel developer who is testing VM code
     under development might consider any percentage
     over 1 percent "too high" for certain kinds of
     locks. A support representative testing released
     product software should look for percent misses
     values that exceed the range of 5 to 7 percent.

o    A large waitsum value

If you see these results, there are no hard and fast rules
to tell you whether the lock contention is caused by poor
code design in applications that are running on the system
or by a problem in the operating system software. A locking
problem is simply an indication that there is high
contention for a certain type of resource.
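
In the sample output shown in this reference page, each percent
misses value is consistent with misses divided by tries, expressed
as a percentage (for example, 194509 misses in 5718642 tries is 3.4
percent). The following sketch applies that inferred relationship
and compares the result to a threshold such as the 5 percent figure
mentioned above. The formula is an inference from the sample data,
not a documented definition, and the function names are
hypothetical.

```shell
# Print the percent misses value for the given counts, to one decimal.
#   $1 = misses   $2 = tries
pct_misses() {
    awk -v m="$1" -v t="$2" 'BEGIN {
        if (t == 0) { print "0.0"; exit }
        printf "%.1f\n", 100 * m / t
    }'
}

# Succeed if the percent misses value is greater than the threshold.
#   $1 = misses   $2 = tries   $3 = threshold in percent
over_threshold() {
    awk -v m="$1" -v t="$2" -v lim="$3" \
        'BEGIN { exit !(t > 0 && 100 * m / t > lim) }'
}
```

With the BsBufHashLock figures from the first sample display,
over_threshold 89312 1377903 5 succeeds (6.5 percent), while the
vnode.v_lock figures (1527 misses in 1578583 tries) do not exceed
that threshold.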
If contention exists for a lock related to I/O and a particular
application is spawning many processes that compete
for the same files and directories, application or
database storage design adjustments might be in order.
Applications that use System V semaphores can sometimes
encounter locking contention if they create a very large
number of semaphores in a single semaphore set because the
kernel uses locks on each set of semaphores. In this
case, performance improvements might be realized by changing
the application to use more semaphore sets, each with
a smaller number of semaphores. Contention for locks on
other kinds of resources, such as memory, is less likely
to be remedied by changing application code; however, further
investigation has to be done before you can rule out
a problem in the application.
Keep in mind that even when lock contention can be reduced
by changing applications, and even when that change is the
only short-term solution, the problem might still need to be
reported to central engineering.
For example, if the applications were not causing locking
problems when run on an SMP system but do on a NUMA system
with comparable CPU and memory resources, support representatives
should report the case to central engineering.
In this case, kernel developers will want to investigate
system algorithms that better handle the contention.
In the first example of lockinfo output, a few locks show
high values in the tries, percent misses, and waitsum
columns. The following example uses lockinfo to show the
code paths where one of these, bsBuf.bufLock, is asserted
with high frequency:

     # lockinfo -class=bsBuf.bufLock sleep 60
     hostname:    sysname.node.corp.com
     lockmode:    2 (SMP default)
     processors:  4
     start time:  Wed Jun 9 15:02:39 1999
     end time:    Wed Jun 9 15:03:39 1999
     command:     sleep 60

     Locks asserted by PC for lock class: bsBuf.bufLock

        count    miss  caller : line #           return : line #
     ----------------------------------------------------------------------
       733418   41275  bs_pinpg_one_int: 4494    bs_pinpg_clone: 4125
       704579   49182  bs_pinpg_one_int: 4460    bs_pinpg_clone: 4125
       697209       0  find_page: 5986           bs_pinpg_one_int: 4368
       680910   30359  bs_unpinpg: 5461          log_donerec_nunpin: 3149
       544828   12869  bs_q_lazy: 2006           bs_q_list: 1142
       496294   15537  bs_unpinpg: 5670          log_donerec_nunpin: 3149
       357840       0  find_page: 5986           bs_refpg_int: 2800
       115502   21058  bs_derefpg: 3982          bmtr_update_rec_int: 2473
       . . .
            7       0  seq_ahead_cont: 6879      bs_pinpg_one_int: 4568
            2       0  bs_derefpg: 3982          frag_group_dealloc: 1637

              tries   reads   trmax  misses  percent  sleeps  waitmax  waitsum
                                              misses          seconds  seconds
     bsBuf.bufLock (S)
            5322157       0   45745  366848      6.9       0  0.00009  1.77041
       . . .
     inifaddr_lock (C)
                  1       1       1       0      0.0       0  0.00000  0.00000

     total simple_locks = 26331214    percent unknown = 0.0
     total rws_locks = 1265           percent reads = 100.0
     total complex_locks = 2544907    percent reads = 33.2
                                      percent unknown = 0.0

Based on the information returned about high frequency
code path assertions for the bsBuf.bufLock lock, the kernel
developer can then look for ways to reduce the amount of
locking for this class. Strategies for reaching this goal
might include one or more of the following:

o    Changing the code to use a read/write spin lock rather
     than a simple spin lock

o    Reducing lock hold times by moving some work outside
     the time that the lock is being held

o    Making more radical changes in kernel algorithms to
     reduce the frequency of lock assertions or the amount
     of time that locks are held

The support representative with training in operating system
internals might be interested in tracing high frequency
code path assertions to better determine if a system
performance problem requires submission of a problem
report on the operating system software or if changes need
to be made in third-party or site-specific applications.
See EXAMPLES for examples of lockinfo commands that
include the -percpu and -rad options.
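
The two-pass procedure described earlier (general statistics
first, then a -class trace of a suspect lock class) can be wrapped
in small shell functions. This is only a sketch: the function names
and the LOCKINFO variable are hypothetical, and the commands use
only the operands and options documented in this reference page.

```shell
# Path of the lockinfo binary; override it to run a copy that is
# installed somewhere other than /usr/sbin.
LOCKINFO=${LOCKINFO:-/usr/sbin/lockinfo}

# Pass 1: general statistics for all lock classes.
#   $1 = number of seconds to gather statistics
first_pass() {
    "$LOCKINFO" sleep "$1"
}

# Pass 2: trace the code paths that assert one lock class.
#   $1 = lock class name   $2 = number of seconds
trace_class() {
    "$LOCKINFO" -class="$1" sleep "$2"
}
```

For example, first_pass 60 followed by trace_class bsBuf.bufLock 60
issues the same two commands as the examples shown earlier.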
Privileges for using the lockinfo command are based on
permissions for the /dev/lockdev file. By default, permissions
on this file mean that using the lockinfo command
requires superuser privilege. System administrators can
change the permissions on the /dev/lockdev special file to
allow "others" to gather lockinfo statistics without
granting a particular user superuser privileges.
The lockinfo command is indirectly invoked by the
sys_check script. The owner and group for the /dev/lockdev
file should not be changed; in other words, the file must
be owned by root and remain in group mem. Furthermore,
user root and group mem must have read permission on the
file. For security reasons, do not run lockinfo in setuid
root mode or add nonprivileged users to group mem. (The
latter action gives nonprivileged users read permissions
on physical memory.)
The version of lockinfo software included in Tru64 UNIX
Version 5.1B and higher releases includes NUMA support
(gathering of statistics on a per-RAD basis). Support
representatives can copy this version of lockinfo to NUMA
platforms running Tru64 UNIX Version 5.1 or Version 5.1A,
but should be careful to protect the version of lockinfo
already installed on the system by using one of the
following strategies:

o    Copy the new version to and run it from a directory
     other than /usr/sbin. (For example, copy the new
     version of lockinfo to /usr/local/bin.)

o    Copy the new version of lockinfo to /usr/sbin but under
     a different utility name (such as lockinfo_2).

Protecting the version of lockinfo that originally shipped
with the system is important because of the sys_check
dependency.
Running the lockinfo command can impact system performance.
However, the performance impact exists only for
the length of time that statistics are being gathered (the
length of time it takes for the specified command to execute).
Only one instance of lockinfo can be running on the system
at a particular time. This is because a process cannot
open the /dev/lockdev pseudo device when it is already
opened by another process.
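
When another process already has /dev/lockdev open, lockinfo
reports "open: Device busy" and the documented action is to wait
and try again. The following sketch automates that retry. It
assumes that lockinfo writes the message to standard error and
exits with a nonzero status in this case; neither detail is stated
in this reference page. The function name is hypothetical.

```shell
# Run a command, retrying while it fails with "Device busy".
#   $1 = maximum number of attempts
#   $2 = seconds to wait between attempts
#   remaining arguments = the command to run
retry_if_busy() {
    max=$1
    delay=$2
    shift 2
    errfile=${TMPDIR:-/tmp}/retry_if_busy.$$
    n=1
    while :; do
        "$@" 2>"$errfile"
        rc=$?
        # Retry only on a failure whose error text mentions a busy device.
        if [ "$rc" -ne 0 ] && [ "$n" -lt "$max" ] &&
           grep "Device busy" "$errfile" >/dev/null; then
            n=$((n + 1))
            sleep "$delay"
            continue
        fi
        # Otherwise pass the captured error text through and stop.
        cat "$errfile" >&2
        rm -f "$errfile"
        return "$rc"
    done
}
```

A call such as retry_if_busy 5 30 /usr/sbin/lockinfo sleep 60 would
make up to five attempts, 30 seconds apart. Any other failure is
returned immediately with its original exit status.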
The lockinfo command is intended for use mainly by other
facilities distributed with the operating system, kernel
developers, and support representatives. It is not
intended for use by customers who are not working with a
support representative to diagnose a performance problem.
Furthermore, lockinfo should not be invoked by customer
scripts and applications for the following reasons:

o    The utility displays information that is difficult to
     interpret without training in operating system
     internals; many lock class names are not intuitively
     obvious; that is, the name does not suggest an
     association with a particular kernel subsystem or the
     type of system resource being locked.

o    The utility interface and output are subject to change,
     without advance notice, from one release to another and
     by patches that might be applied to the utility.

Success. An error occurred.
Lock class class_name not asserted
Explanation: The name specified for the -class option is a
supported lock class for the system. However, no trace
information is available; during the time that locking
statistics were being gathered, the specified lock was not
asserted.
User Action: Verify class names in output from lockinfo
that is entered without the -class option. You also might
need to better synchronize the statistics-gathering period
with a time that the lock class is being asserted in order
to generate trace information. This may take a few tries
or an increase in the number of seconds that statistics
are gathered.
lock statistics not supported in lockmode 0 or 1
Explanation: The lockmode attribute must be set to 2, 3,
or 4 for the lockinfo command to work. The default for
SMP systems is 2, so someone reset the value to 0 or 1
when the system was booted.
User Action: The attribute value must be changed to 2, 3,
or 4 and the system rebooted before lockinfo will work.
open: Device busy
Explanation: Another process has the /dev/lockdev device
open. The other process might be running lockinfo directly
or be running the sys_check script (which indirectly
invokes lockinfo).
User Action: Wait until the first instance of lockinfo has
stopped running and try the command again.
option since_boot only supported in lockmode 4
current lockmode: lockmode_value
Explanation: The system must have been booted with lockmode
set to 4 in order to start gathering locking statistics
at system boot time. Because of performance implications
for production systems, this is usually done only on
a development system.
User action: Reset lockmode to 4 and reboot the system or,
if that is not possible, use lockinfo to gather statistics
for short time periods after performance problems are
noticed.
top count too big count. Max value 1024
Explanation: A number larger than 1024 was specified for
the -top argument.
User action: Reduce the value. You typically gather trace
information for only the top few lock classes that display
the largest values for the specified sort criteria.
rad number rad_id is invalid
Explanation: The specified RAD number is less than 0 or
greater than the number of RADs on the system.
User action: Re-enter the lockinfo command with a corrected
RAD identifier.
Rad rad_id has no active processors
Explanation: The specified RAD exists on the system but
the RAD has no active processors. Therefore, there are no
locking statistics to display for that RAD. Note that the
lockinfo command does not find processors if they are
installed but not running. The CPUs in the specified RAD
might have been in the process of being taken on or off
line when lockinfo was run.
User action: Make sure that you specified the number for a
RAD in which you expect CPUs to be active. If not, reenter
the command with a corrected RAD number. To check on
processor status, use the psrinfo command.
top number of count larger then number of lock types number
Explanation: The -top argument exceeded the supported number
of lock class names.
User action: Reduce the value. You usually gather trace
information for only the top few lock classes that display
the largest values for the specified sort criteria.
Unknown class name: class_name
Explanation: An invalid lock class name was specified for
the -class option.
User Action: Verify class names in output from lockinfo
when entered without the -class option. Check the
spelling and letter case for supported names and re-enter
the command with a corrected -class argument.
The following lockinfo command gathers locking statistics
for each processor over a period of 60 seconds:

     # lockinfo -percpu sleep 60
     hostname:    sysname.node.corp.com
     lockmode:    4 (SMP DEBUG with kernel mode preemption enabled)
     processors:  4
     start time:  Wed Jun 9 14:45:08 1999
     end time:    Wed Jun 9 14:46:08 1999
     command:     sleep 60

              tries   reads   trmax  misses  percent  sleeps  waitmax  waitsum
                                              misses          seconds  seconds
     bsBuf.bufLock (S)
     0      1400786       0   45745   47030      3.4       0  0.00007  0.15526
     1      1415828       0   45367   47538      3.4       0  0.00006  0.15732
     2      1399462       0   33076   48507      3.5       0  0.00005  0.15907
     3      1398336       0   31753   48867      3.5       0  0.00005  0.15934
     -------------------------------------------------------------------------
     ALL    5614412       0   45745  191942      3.4       0  0.00007  0.63099
     lock.l_lock (S)
     0      1360769       0   40985   18460      1.4       0  0.00005  0.04041
     1      1375384       0   20720   18581      1.4       0  0.00005  0.04124
     2      1375122       0   20657   18831      1.4       0  0.00009  0.04198
     -------------------------------------------------------------------------
     ALL    5483049       0   40985   74688      1.4       0  0.00009  0.16526
       . . .
     inifaddr_lock (C)
     0            0       0       1       0      0.0       0  0.00000  0.00000
     1            1       1       1       0      0.0       0  0.00000  0.00000
     2            0       0       1       0      0.0       0  0.00000  0.00000
     3            0       0       1       0      0.0       0  0.00000  0.00000
     -------------------------------------------------------------------------
     ALL          1       1       1       0      0.0       0  0.00000  0.00000

     total simple_locks = 28100338    percent unknown = 0.0
     total rws_locks = 1466           percent reads = 100.0
     total complex_locks = 2716146    percent reads = 33.2
                                      percent unknown = 0.0
     #

The following lockinfo command gathers locking statistics
for a particular RAD on a NUMA platform. In this
example, only cumulative statistics across all CPUs in the
RAD are displayed.

     # /usr/sbin/lockinfo -rad=4 sleep 10
     hostname:    sysname.node.corp.com
     locktype:    MCS Locks
     lockmode:    2 (SMP default)
     Tracing:     RAD 4 only
     processors:  4
     start time:  Thu May 17 09:15:09 2001
     end time:    Thu May 17 09:15:19 2001
     command:     sleep 10

              tries   reads   trmax  misses   misses  sleeps  waitmax  waitsum
                                             percent          seconds  seconds
     processor.callout_lock (M)
               1298       0       0       0      0.0       0  0.00000  0.00001
     thread.lock (M)
                336       0       0       0      0.0       0  0.00000  0.00000
     processor.lock (M)
                226       0       0       1      0.4       0  0.00000  0.00016
     pag.idle_lock (M)
                124       0       0       0      0.0       0  0.00000  0.00002
     wait_queue.lock (M)
                114       0       0       0      0.0       0  0.00000  0.00000
     unknown_simple_lock (M)
                 76       0       0       0      0.0       0  0.00000  0.00000
     misc_tcp_lock (M)
                 70       0       0       0      0.0       0  0.00000  0.00000
     pag.lock (M)
                 10       0       0       0      0.0       0  0.00000  0.00000
     cam_softc (M)
                 10       0       0       0      0.0       0  0.00000  0.00000
     vm_control.vm_free_lock (M)
                 10       0       0       0      0.0       0  0.00000  0.00000
     cam_x_pqhead1 (M)
                  8       0       0       0      0.0       0  0.00000  0.00000
     pmap.lock (M)
                  6       0       0       0      0.0       0  0.00000  0.00000
     proc.p_lock (M)
                  2       0       0       0      0.0       0  0.00000  0.00000
     nxm_vp_lock (M)
                  2       0       0       0      0.0       0  0.00000  0.00000
     task.ipc_translation_lock (M)
                  1       0       0       0      0.0       0  0.00000  0.00000
     nxm_thread_lock (M)
                  1       0       0       0      0.0       0  0.00000  0.00000
     kern_port.port_data_lock (M)
                  1       0       0       0      0.0       0  0.00000  0.00000
     port_hash_bucket.lock (M)
                  1       0       0       0      0.0       0  0.00000  0.00000

     total mcs_locks = 2296           percent unknown = 0.0
     total rws_locks = 0              percent reads = 0.0
     total complex_locks = 0          percent reads = 0.0
                                      percent unknown = 0.0

The following lockinfo command gathers locking
statistics for a particular RAD on a NUMA platform.
In this example, per-CPU statistics are included as
well as the total for all CPUs in the RAD.

     # /usr/sbin/lockinfo -rad=4 -percpu sleep 10
     hostname:    sysname.node.corp.com
     locktype:    MCS Locks
     lockmode:    2 (SMP default)
     Tracing:     RAD 4 only
     processors:  4
     start time:  Thu May 17 09:16:33 2001
     end time:    Thu May 17 09:16:43 2001
     command:     sleep 10

              tries   reads   trmax  misses   misses  sleeps  waitmax  waitsum
                                             percent          seconds  seconds
     processor.callout_lock (M)
     16         110       0       0       0      0.0       0  0.00000  0.00000
     17          92       0       0       0      0.0       0  0.00000  0.00000
     18         100       0       0       0      0.0       0  0.00000  0.00000
     19         992       0       0       0      0.0       0  0.00000  0.00000
     -------------------------------------------------------------------------
     ALL       1294       0       0       0      0.0       0  0.00000  0.00003
     thread.lock (M)
     16          54       0       0       0      0.0       0  0.00000  0.00000
     17          30       0       0       0      0.0       0  0.00000  0.00000
     18         162       0       0       0      0.0       0  0.00000  0.00000
     19          90       0       0       0      0.0       0  0.00000  0.00000
     -------------------------------------------------------------------------
     ALL        336       0       0       0      0.0       0  0.00000  0.00000
     processor.lock (M)
     16          37       0       0       0      0.0       0  0.00000  0.00000
     17          18       0       0       0      0.0       0  0.00000  0.00000
     18         108       0       0       0      0.0       0  0.00000  0.00000
     19          63       0       0       0      0.0       0  0.00000  0.00000
     -------------------------------------------------------------------------
     ALL        226       0       0       0      0.0       0  0.00001  0.00022
     pag.idle_lock (M)
     16          21       0       0       0      0.0       0  0.00000  0.00000
     17          12       0       0       0      0.0       0  0.00000  0.00000
     18          54       0       0       0      0.0       0  0.00000  0.00000
     19          37       0       0       1      2.7       0  0.00001  0.00001
     -------------------------------------------------------------------------
     ALL        124       0       0       1      0.8       0  0.00001  0.00003
     wait_queue.lock (M)
     16          21       0       0       0      0.0       0  0.00000  0.00000
     17           2       0       0       0      0.0       0  0.00000  0.00000
     18          54       0       0       0      0.0       0  0.00000  0.00000
     19          37       0       0       0      0.0       0  0.00000  0.00000
     -------------------------------------------------------------------------
     ALL        114       0       0       0      0.0       0  0.00000  0.00000
     unknown_simple_lock (M)
     16          56       0       0       0      0.0       0  0.00000  0.00000
     17           0       0       0       0      0.0       0  0.00000  0.00000
     18          20       0       0       0      0.0       0  0.00000  0.00000
     19           0       0       0       0      0.0       0  0.00000  0.00000
     -------------------------------------------------------------------------
     ALL         76       0       0       0      0.0       0  0.00000  0.00000
       . . .

     total mcs_locks = 2292           percent unknown = 0.0
     total rws_locks = 0              percent reads = 0.0
     total complex_locks = 0          percent reads = 0.0
                                      percent unknown = 0.0

/dev/lockdev
     Pseudo driver that is opened by the lockinfo utility for
     statistics gathering.

Commands: sched_stat(8), sys_check(8)
Others: sys_attrs_generic(5)
lockinfo(8)