NAME

spike - Performs code optimization after linking a program

SYNOPSIS

spike binary_file [options...]

OPTIONS

Determines the frequency cutoff point for the alignment of
profile-based basic blocks. Valid threshold values are
floating-point numbers between 0 and 1, inclusive. A
higher number means more basic blocks will be quadword
aligned. [Default: 0.95]

-arch option
Specifies the version of the Alpha architecture for which
to generate instructions. See cc(1) for information about
the possible values of option and for a comparison of
-arch and -tune. The default option is ev4. Spike will
accept binaries that contain instructions that require an
architectural extension not present in the processor
specified by -arch. However, Spike will assume that the
instructions are guarded by code that prevents their
execution on some systems and will restrict some
optimizations. For best results, use an appropriate -arch
option.

Specifies a new data segment starting address, where n is
a 64-bit hexadecimal number without the leading "0x".

-dcpi_prefetch
Enables DCPI profile-based prefetching (works only with a
feedback profile).

Controls DCPI profile-based prefetching. The number (n)
describes the minimum latency (in cycles) that a load must
have to become a candidate for prefetching. [Default: 50]

Controls DCPI profile-based prefetching. Valid threshold
values are floating-point numbers between 0 and 1,
inclusive. The number is used to determine the frequency
cutoff for loads to be prefetched; a higher number means
more loads will be prefetched. [Default: 0.75]

-feedback file
Causes Spike to use the feedback database stored in file,
where file is the name of the input executable. This
database is created by first compiling the program with
the -feedback option (for example, cc -feedback prog) and
then instrumenting and running the program with the pixie
-update or prof -pixie -update command (see cc(1),
pixie(1), and prof(1)).

-fb file
Causes Spike to use file.Addrs (basic block addresses
file) and file.Counts (basic block counts file) for
profile-based optimization. These files are produced by
the pixie tool (see pixie(1) and prof(1)).

Prints a short help screen.

Use this option when applying Spike to the UNIX kernel
(vmunix). Spike can be applied only to V5.1 or later
kernels.

-noaggressiveAlign
Reduces the number of padding nops inserted into the code
to align instructions. The alignment usually makes the
code run faster, but also makes the code larger, which can
cause more instruction cache misses.

Disables basic block chaining, which arranges code so that
the fall-through path is the commonly taken path.

Invokes the original Spike and not the DTK version.

Disables procedure ordering.

Disables the code layout optimization that splits
procedures into multiple parts.

Specify Spike's optimization level. These flags are
provided for compatibility with other compilation tools
and currently have no effect.

-o output_file
Names the optimized binary output file. The default file
name is a.out.

-obsolete_linkerdefs
Specifies that obsolete linker-defined symbols are to be
ignored. (See RESTRICTIONS.)

Determines the frequency cutoff point for profile-based
optimizations. Valid threshold values are floating-point
numbers between 0 and 1, inclusive. A higher number means
more routines will be optimized. [Default: 0.99]

-splitThresh threshold
Determines the frequency cutoff point for profile-based
routine splitting. Valid threshold values are
floating-point numbers between 0 and 1, inclusive. A
higher number means more code is considered hot and less
code is considered cold. [Default: 0.95]

-stride_prefetch
Enables stride prefetching based on a profile of
data-address strides collected by using the Pixie tool.
This optimization mainly targets programs in which many
data cache misses occur inside loops.

Keeps unreachable routines from being deleted by Spike if
they have an entry in the symbol table.

Specifies a new text segment starting address, where n is
a 64-bit hexadecimal number without the leading "0x".

-tune option
Instructs the optimizer to tune the application for a
specific version of the Alpha architecture. See cc(1) for
information about the possible values of option and for a
comparison of -tune and -arch. The default option is ev6.

Displays the version number of Spike.

Enables extra warning messages.
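
The threshold options above accept floating-point numbers
between 0 and 1, inclusive, and the segment-address options
take a 64-bit hexadecimal number without the leading "0x".
The following is a minimal Python sketch of checking and
formatting such values before building a spike command
line. The helper names parse_threshold and
format_segment_address are hypothetical; they are not part
of Spike.

```python
def parse_threshold(text):
    """Return text as a float if it lies in the range [0, 1]."""
    value = float(text)  # raises ValueError for non-numeric text
    # The chained comparison is also False for NaN, so NaN is rejected.
    if not (0.0 <= value <= 1.0):
        raise ValueError(f"threshold must be between 0 and 1: {text!r}")
    return value


def format_segment_address(address):
    """Format a non-negative integer below 2**64 as hex without '0x'."""
    if not (0 <= address < 2**64):
        raise ValueError(f"address does not fit in 64 bits: {address!r}")
    return format(address, "x")
```

For example, parse_threshold(".999") accepts the value used
with -splitThresh in the examples, while
parse_threshold("1.5") raises ValueError.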

OPERANDS

binary_file
Name of the binary file to which Spike is to be applied.

DESCRIPTION

Spike is a tool for performing code optimization after
linking. It is a replacement for om and does similar
optimizations. Because it can operate on an entire program,
Spike is able to do optimizations that cannot be done by
the compiler.

Some of the optimizations that Spike performs are code
layout, deleting unreachable code, and optimization of
address computations. Spike is most effective when it uses
profile information to guide optimization.

Spike can process binaries linked on Tru64 UNIX (formerly
Digital UNIX) Version 4.0 or later systems. Binaries that
are linked on Version 5.1 or later systems contain
information that allows Spike to do additional
optimization.

You can use Spike in two ways:

* By applying the spike command to a binary file after
  compilation.
* As part of the compilation process, by specifying the
  -spike option with the cc command (or the cxx, f77, or
  f90 command, if the associated compiler is installed).

The -spike option is more convenient when you are not
using profile information (Example 2), or you are using
profile information in the compiler, too (Example 3). The
spike command is more convenient if you do not want to
relink the executable (Example 1) or you are using profile
information after compilation (Examples 4 and 5).

All spike command options can be passed directly to the cc
command's -spike option by using the cc command's -WS
option. Example 6 shows the syntax.
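
The -WS argument in Example 6 is a comma-separated list
that starts with -WS and is followed by the spike options
and their values. As a minimal sketch, a build script
written in Python could assemble that argument as follows.
The function name ws_option is hypothetical, and the
rejection of options containing commas or whitespace is an
assumption made here, not documented behavior.

```python
def ws_option(spike_options):
    """Join spike command options into one cc -WS argument."""
    for option in spike_options:
        # Assumption: an option containing a comma or whitespace cannot
        # be passed through, because the comma is the separator.
        if not option or "," in option or any(ch.isspace() for ch in option):
            raise ValueError(f"cannot pass option through -WS: {option!r}")
    return ",".join(["-WS", *spike_options])


print(ws_option(["-splitThresh", ".999", "-noaggressiveAlign"]))
# prints: -WS,-splitThresh,.999,-noaggressiveAlign
```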

RESTRICTIONS

Spike cannot process the following images:

* Images that have been stripped.
* Images that contain certain obsolete linker-defined
  symbols and structures such as RPDR tables (see Section
  2.3.7, Special Symbols, of the Object File/Symbol Table
  Format Specification). This restriction can be overridden
  by using the -obsolete_linkerdefs option, but the
  resulting binary files may be incorrect, so use it with
  caution.
* Images that modify the text section at run time.

Using cord, atom, pixie, hiprof, or third on an image that
has been processed with Spike is unsupported.

NOTES

Spike tries to update the symbol table in the binary so
that the optimized binary can be debugged. As with other
compiler optimizations, in some situations the debugger
may not be able to properly report the current location in
the program or display the values of variables.

If Spike divides a procedure into multiple disjoint parts,
the main body keeps the original procedure name, and the
other parts have names that are the original name with
_cold_n (where n is a unique number) appended to the end.
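
When reading profiler or debugger output for an optimized
binary, the naming rule above can be used to map a split
part back to its procedure. The following Python sketch is
one way to do that. The function name original_procedure is
hypothetical, and the pattern assumes that n is written in
decimal digits, which the text above does not state.

```python
import re

# Assumption: the unique number n is written in decimal digits.
_COLD_PART = re.compile(r"^(?P<name>.+)_cold_(?P<n>[0-9]+)$")


def original_procedure(symbol):
    """Return the procedure name that a symbol belongs to.

    A name of the form <name>_cold_<n> is mapped to <name>;
    any other name is returned unchanged.
    """
    match = _COLD_PART.match(symbol)
    return match.group("name") if match else symbol
```

A procedure whose own source name already ends in _cold_
followed by digits would be misattributed by this sketch,
so treat the result as a hint rather than a guarantee.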

EXAMPLES

1. In the following example, Spike is applied to the binary
   my_prog, producing the optimized output file prog1.opt:

       % spike my_prog -o prog1.opt

2. In the following example, Spike is applied during
   compilation with the cc command's -spike option:

       % cc -c file1.c
       % cc -o prog3 file1.o -spike

   The first command line creates the object file file1.o.
   The second command line links file1.o into an executable
   image and uses Spike to optimize the executable image.

3. The following example shows how to optimize a program,
   prog, by first compiling it with the -feedback option,
   then merging profiling statistics from two instrumented
   runs of the program, and then compiling it with the
   -spike and -feedback options so that the feedback
   information stored in the executable image is used by
   the compiler and Spike:

       % cc -feedback prog -o prog *.c
       % pixie -pids prog
       % prog.pixie            (input set 1)
       % prog.pixie            (input set 2)
       % prof -pixie -update prog prog.Counts.*
       % cc -spike -feedback prog -o prog *.c

   The first compilation produces an augmented executable
   image that will later accept feedback information.

   The pixie command creates an instrumented program
   (prog.pixie), which is then run twice. The -pids option
   adds the process ID of each test run to the name of the
   profiling data file produced, for example,
   prog.Counts.371 and prog.Counts.422.

   The prof -pixie command merges the two data files. The
   -update option updates the executable image, prog, with
   the combined information.

   The program is compiled with the -spike and -feedback
   options so the feedback information stored in the
   executable image is used by the compiler and Spike.

4. The following example shows how to optimize a program,
   prog, by first compiling it with the -feedback option,
   then merging profiling statistics from two instrumented
   runs of the program, and then applying the spike
   -feedback command to use the feedback information stored
   in the executable image:

       % cc -feedback prog -o prog *.c
       % pixie -pids prog
       % prog.pixie            (input set 1)
       % prog.pixie            (input set 2)
       % prof -pixie -update prog prog.Counts.*
       % spike prog -feedback prog -o prog.opt

   As in the previous example, the first compilation
   produces an augmented executable image. The instrumented
   program is run twice, producing a uniquely named data
   file each time. The prof -pixie -update command merges
   the two data files and updates the executable image with
   the combined information.

   The spike -feedback command uses the combined profiling
   information to produce the optimized output file
   prog.opt.

5. The following example shows how to optimize a program,
   prog, by merging profiling statistics from two
   instrumented runs of the program, then applying the
   spike -fb command to use the feedback information in the
   prog.Addrs and prog.Counts files:

       % cc -o prog *.c
       % pixie -pids prog
       % prog.pixie            (input set 1)
       % prog.pixie            (input set 2)
       % prof -pixie -merge prog.Counts prog prog.Addrs prog.Counts.*
       % spike prog -fb prog -o prog.opt

   The first compilation produces a normal executable
   image. As in the previous example, the instrumented
   program is run twice, producing a uniquely named data
   file each time.

   The prof -pixie -merge command merges the two data files
   into one combined prog.Counts file.

   The spike -fb command uses the information in prog.Addrs
   and prog.Counts to produce the optimized output file
   prog.opt.

   The method in Example 4 is preferred. You should use the
   method in Example 5 only if you cannot compile with the
   -feedback option, which uses feedback information stored
   in the executable image.

6. The following example shows the syntax for passing spike
   command options to the cc command's -spike option by
   using the cc command's -WS option:

       % cc -spike -feedback prog -o prog *.c \
            -WS,-splitThresh,.999,-noaggressiveAlign

7. The following example shows how to optimize a program,
   prog, using profiles obtained with the DCPI profiler:

       % mkdir db                  # create profile directory
       % dcpid db                  # start dcpi daemon
       % ./prog                    # run your program
       % dcpiquit                  # stop dcpi daemon
       % dcpi2bb -make_bbdb -counts -pm all -conf_low -db db prog
                                   # store feedback information in the binary
       % spike prog -feedback prog
                                   # spike your program utilizing feedback

8. The following example is similar to the previous one,
   but it contains three modifications for DCPI-based
   prefetching:

       % mkdir db                  # create profile directory
       % dcpid -vtrace /usr/lib/dcpi/vp-ldlatency.so db
                                   # start dcpi daemon
       % ./prog                    # run your program
       % dcpiquit                  # stop dcpi daemon
       % dcpi2bb -make_bbdb -counts -pm all -conf_low -load_lat -db db prog
                                   # store feedback information in the binary
       % spike prog -dcpi_prefetch -feedback prog
                                   # spike your program utilizing feedback

9. The following example demonstrates how to perform stride
   prefetching.

   First, instrument an executable image (prog) for
   profiling address strides with the following command:

       % pixie -stats dstride prog     # Step (a): instrumentation

   This command creates an instrumented program
   (prog.pixie).

   Second, run the instrumented program with the input
   intended for training:

       % prog.pixie input              # Step (b): stride profiling

   This command generates a profile of address strides,
   which is stored in the file prog.Counts.

   Finally, invoke Spike to insert stride prefetches:

       % spike prog -fb prog -stride_prefetch -o prog.pf
                                       # Step (c): prefetch insertion

   The output (prog.pf) is a version of the program with
   stride prefetches inserted.

   Note that it is possible to perform both stride
   prefetching and other feedback-directed optimizations at
   the same time. To do this, first collect the feedback
   information for the other optimizations and store it in
   the executable image by using the following sequence:

       % cc -feedback prog -o prog *.c
       % pixie prog
       % prog.pixie input
       % prof -pixie -update prog prog.Counts

   Then repeat Steps (a) to (c) for stride prefetching,
   except that you turn on both stride prefetching and the
   other feedback-directed optimizations in a single spike
   command:

       % pixie -stats dstride prog     # same as Step (a)
       % prog.pixie input              # same as Step (b)
       % spike prog -feedback prog -fb prog -stride_prefetch -o prog.opt_pf
                                       # Step (c) plus other feedback-directed
                                       # optimizations

   The output (prog.opt_pf) is a version of the program
   with both stride prefetching and other feedback-directed
   optimizations.
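
The examples that run pixie -pids merge the per-run data
files with prof -pixie -update prog prog.Counts.*. When
that step is scripted, it can help to fail early if no
instrumented run has produced a data file. The following
is a minimal Python sketch, assuming the prog.Counts.<pid>
naming described above. The function name
prof_update_command is hypothetical; it only builds the
argument list and does not run prof.

```python
import glob


def prof_update_command(prog):
    """Build the prof -pixie -update argument list for prog.

    Collects every data file named <prog>.Counts.<suffix> and
    raises FileNotFoundError if there is none.
    """
    # glob.escape protects any wildcard characters in the program path.
    counts_files = sorted(glob.glob(glob.escape(prog) + ".Counts.*"))
    if not counts_files:
        raise FileNotFoundError(f"no {prog}.Counts.* files found")
    return ["prof", "-pixie", "-update", prog, *counts_files]
```

The returned list can be passed to a process launcher such
as subprocess.run, which avoids relying on the shell to
expand the prog.Counts.* pattern.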

EXIT STATUS

Spike returns the following status values:
0: Success
Nonzero: Error
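
Because any nonzero status means an error, a build script
can stop as soon as Spike fails. The following Python
sketch shows one way to check the status. The function
name run_checked is hypothetical, and the spike command
line in the usage note is taken from the examples in this
reference page.

```python
import subprocess
import sys


def run_checked(argv):
    """Run a command and return True on status 0, False otherwise."""
    status = subprocess.run(argv).returncode
    if status != 0:
        # Any nonzero status is reported as an error.
        print(f"{argv[0]} failed with status {status}", file=sys.stderr)
    return status == 0
```

A call such as run_checked(["spike", "prog", "-feedback",
"prog", "-o", "prog.opt"]) returns False if Spike reports
an error. If the spike command itself cannot be found,
subprocess.run raises FileNotFoundError instead.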

SEE ALSO

cc(1), pixie(1), prof(1)
Programmer's Guide
The spike web page at http://www.tru64unix.compaq.com/spike/
The DCPI web page at http://www.tru64unix.compaq.com/dcpi/
spike(1)