sfkeywords - soundfile keywords used in sfinfo, sfplay, and sfconvert
Many of the sf programs require descriptions of soundfile formats. These
descriptions are always specified using the same set of keywords, which
are given one after the other on the command line, separated by spaces.
     byteorder e      endian (e is big or little)
     channels n       n-channel file (1 or 2)
     rate r           sampling rate r, in Hertz
     format f         file format f (see below)
     integer n s      n-bit integer file, where s is:
                        2scomp:   2's complement signed data
                        unsigned: unsigned data
     float m          floating point file, maxamp m (usually 1.0)
     mulaw            mulaw file (8-bit only)
     dataoff o        data starts at byte offset o (for raw data)
The keywords do not need to be spelled out; only the first character, or
the first 2 characters for 'float' and 'format', is required.
These keywords are used in situations where information about a soundfile
format is needed, such as in sfconvert:
     sfconvert in.snd out.aif format aiff integer 16 2 chan 2
This specifies a stereo, 16-bit (2's complement signed) integer AIFF file.
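As a rough illustration of how such a keyword list might be interpreted, the
sketch below expands abbreviated keywords by unique prefix and groups each
keyword with its parameters. The names KEYWORD_ARGS, expand_keyword, and
parse_keywords are hypothetical, and this is not the parser the sf programs
actually use; the parameter counts are taken from the keyword table above.

```python
# Hypothetical sketch only; the real sf programs may parse differently.
# Number of parameters each keyword takes, per the keyword table.
KEYWORD_ARGS = {
    "byteorder": 1,
    "channels": 1,
    "rate": 1,
    "format": 1,
    "integer": 2,
    "float": 1,
    "mulaw": 0,
    "dataoff": 1,
}


def expand_keyword(token):
    """Expand an abbreviation to the single keyword it is a prefix of."""
    matches = [name for name in KEYWORD_ARGS if name.startswith(token)]
    if len(matches) != 1:
        # 'f' alone matches both 'float' and 'format', so it is rejected.
        raise ValueError(f"unknown or ambiguous keyword: {token!r}")
    return matches[0]


def parse_keywords(tokens):
    """Group a flat token list into (keyword, [parameters]) pairs."""
    parsed = []
    i = 0
    while i < len(tokens):
        keyword = expand_keyword(tokens[i])
        nargs = KEYWORD_ARGS[keyword]
        args = tokens[i + 1 : i + 1 + nargs]
        if len(args) != nargs:
            raise ValueError(f"'{keyword}' needs {nargs} parameter(s)")
        parsed.append((keyword, args))
        i += 1 + nargs
    return parsed
```

Calling parse_keywords("format aiff integer 16 2 chan 2".split()) groups the
example above into 'format' with aiff, 'integer' with 16 and 2, and
'channels' with 2. Parameters are left unexpanded in this sketch.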
Note that some keywords, such as 'integer', require parameters. These
parameters can also be abbreviated, except for the parameter of the
The 'format' keyword specifies the file format. Currently supported file
formats are:
     aiff      Audio Interchange File Format
     aifc      AIFF-C File Format
     next      NeXT/Sun Format
     wave      MS RIFF WAVE Format
The 'channels' and 'rate' keywords are fairly straightforward. They
simply specify how many interleaved channels of data the soundfile has
and what sampling rate the data is meant to be played at (in Hertz).
Here are some notes on sampling rates:
Some files, particularly mulaw-encoded 8-bit NeXT soundfiles, have a
sampling rate of 8012.8210513 Hz, which is often abbreviated to 8012.82
Hz or 8.013 kHz. When converting another file to a file with this
sampling rate, you should be sure to specify the full-precision rate.
Otherwise some programs may not recognize the file as playable. WAVE
files store sampling rate as an integral number of samples per second,
therefore they cannot support this sampling rate.
The sfconvert and soundfiler utilities will perform high-quality linear
phase sampling rate conversion between the standard rates 8000, 11025,
16000, 22050, 32000, 44100, and 48000 Hz. For conversions where the
source or target rate is not one of these standard rates, sfconvert and
soundfiler use a lower-quality algorithm, and issue a warning to this
effect. For these lower-quality conversions, some loss of quality is
likely, and audible artifacts may occur in the output sound, especially
on conversions from a higher to a lower sampling rate. This lower-quality
algorithm, which was present in earlier releases, uses third-order
polynomial interpolation and does marginal anti-aliasing. A high-quality
algorithm capable of conversion between arbitrary pairs of sampling rates
is under development.
In order to allow high-quality rate conversion in fairly common cases, if
you attempt to convert an 8012.8210513 Hz soundfile to a soundfile with
any standard rate except 8000 Hz, sfconvert and soundfiler will assume
the input rate is 8000 Hz and perform the conversion, again issuing a
warning to this effect. If the -0.16% shift in pitch (less than three
hundredths of a semitone) is not acceptable, you can first convert the
8012.8210513 Hz soundfile into an 8000 Hz soundfile and then convert the
8000 Hz soundfile to another standard rate, as in the following:
     sfconvert in.aiff temp.aiff rate 8000
     sfconvert temp.aiff out.aiff rate 16000
In this case sfconvert and soundfiler will use the older algorithm, which
is of acceptable quality for small changes in sampling rate, to do the
first conversion, and the new algorithm to do the second conversion with
the best quality.
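The size of the pitch shift quoted above can be checked with a few lines of
arithmetic. The sketch below is only a worked calculation; the function name
pitch_shift is hypothetical. Samples recorded at 8012.8210513 Hz but treated
as 8000 Hz play back slower by the ratio of the two rates, which lowers the
pitch by the same ratio.

```python
import math


def pitch_shift(actual_rate, assumed_rate):
    """Return (percent change, semitone change) in pitch when data sampled
    at actual_rate is treated as if it were sampled at assumed_rate."""
    ratio = assumed_rate / actual_rate
    percent = (ratio - 1.0) * 100.0
    semitones = 12.0 * math.log2(ratio)  # 12 semitones per octave
    return percent, semitones


percent, semitones = pitch_shift(8012.8210513, 8000.0)
print(f"{percent:.2f}% shift, {semitones:.4f} semitones")
```

The result is a shift of about -0.16%, a little under three hundredths of a
semitone downward, which agrees with the figures given above.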
The dual of the previous conversion is possible with a similar procedure.
You may convert from any standard rate to 8012.8210513 Hz by first
converting to 8000 Hz, and then to 8012.8210513 Hz:
     sfconvert in.aiff temp.aiff rate 8000
     sfconvert temp.aiff out.aiff rate 8012.8210513
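The rate-conversion rules described in this section can be summarized in a
small decision function. This is a sketch of the documented behavior, not
code from sfconvert or soundfiler: the names STANDARD_RATES, NEXT_RATE, and
plan_conversion are hypothetical, and the 0.01 Hz tolerance used to
recognize the 8012.8210513 Hz rate is an assumption.

```python
# Hypothetical summary of the documented rules; not taken from sfconvert.
STANDARD_RATES = (8000, 11025, 16000, 22050, 32000, 44100, 48000)
NEXT_RATE = 8012.8210513


def plan_conversion(source_rate, target_rate):
    """Return (assumed_source_rate, quality) for a rate conversion."""
    # Tolerance for recognizing the NeXT mulaw rate is an assumption.
    is_next_rate = abs(source_rate - NEXT_RATE) < 0.01
    target_is_standard = target_rate in STANDARD_RATES

    # Special case: 8012.8210513 Hz input is treated as 8000 Hz when the
    # target is any standard rate other than 8000 Hz.
    if is_next_rate and target_is_standard and target_rate != 8000:
        return 8000, "high"

    # Standard-to-standard conversions use the high-quality algorithm.
    if source_rate in STANDARD_RATES and target_is_standard:
        return source_rate, "high"

    # Everything else falls back to the older, lower-quality algorithm.
    return source_rate, "low"
```

Under these rules the two-step procedures above each consist of one
lower-quality step (between 8000 Hz and 8012.8210513 Hz) and one
high-quality step between standard rates.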
The 'integer', 'float', and 'mulaw' keywords are mutually exclusive
(although no error will be reported if you use more than one). Each
specifies the encoding format of the actual samples themselves:
- an 'integer' soundfile stores sound information as simple unsigned or
2's complement 1-32 bit integers. In the signed case, 0 is the zero
signal level. In the unsigned case, (2^b)/2 is the zero signal level,
where b is the number of bits per integer.
- a 'mulaw' soundfile, which for these programs must be in 8-bit format,
stores companded 13-bit sample values in an 8-bit, unsigned-like format.
If you play a mulaw file using sfplay, its samples are automatically
converted to 16-bit samples which the audio hardware can output.
- a 'float' soundfile consists of IEEE standard floating point numbers.
Generally, -1.0 represents full negative amplitude and 1.0 represents
full positive amplitude, but it is quite possible to generate a soundfile
with sample values of magnitude greater than 1.0. For this reason, the
'float' keyword takes an argument as to what value should be treated as
full maximum amplitude. This is usually 1.0. If you play a floating
point file using sfplay, its sample values are automatically scaled based
on a 1.0 maxamp and converted to 24-bit integers which the audio hardware
can output.
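The zero signal levels given for 'integer' data can be illustrated with a
short sketch. The function names zero_level and unsigned_to_signed are
hypothetical; the code only restates the (2^b)/2 rule from the 'integer'
description and is not taken from the sf programs.

```python
def zero_level(bits, signed):
    """Sample value that represents silence for b-bit integer data."""
    return 0 if signed else (2 ** bits) // 2


def unsigned_to_signed(sample, bits):
    """Re-center an unsigned sample around 0 (2's complement range)."""
    return sample - zero_level(bits, signed=False)
```

For 8-bit unsigned data the zero level is 128, so an unsigned sample of 128
corresponds to a signed sample of 0.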
When converting floating point data to integer data and vice versa, the
sf programs always assume that the highest positive value ((2^b)/2-1 for
b-bit 2's complement integers) maps to the floating point maximum
amplitude, usually 1.0. For example, when converting 16-bit 2's
complement integers to floats of maximum amplitude 1.0, 32767 will map to
+1.0, and -32767 will map to -1.0. This was done so that it is possible
to convert a floating point file to an integer file without clipping a
value off the positive end of the integral range. This means that when
converting ints to floats, it is possible that there will be one value in
the output file that is less than -maxamp where maxamp is the maximum
amplitude specified after the 'float' keyword. If this is a problem, use
a slightly different maximum amplitude which puts all output values
inside the actual desired maximum amplitude.
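The mapping described above can be sketched as follows. The function names
int_to_float and float_to_int are hypothetical, and rounding to the nearest
integer is an assumption; the text only specifies that the highest positive
integer maps to maxamp.

```python
def int_to_float(sample, bits=16, maxamp=1.0):
    """Map a b-bit 2's complement sample to a float, with the highest
    positive integer value, (2^b)/2 - 1, mapping to maxamp."""
    full_scale = 2 ** (bits - 1) - 1
    return sample * maxamp / full_scale


def float_to_int(value, bits=16, maxamp=1.0):
    """Inverse mapping; rounding to nearest is an assumption."""
    full_scale = 2 ** (bits - 1) - 1
    return round(value / maxamp * full_scale)
```

With 16-bit data and a maxamp of 1.0, int_to_float(-32768) falls slightly
below -1.0, which is the one out-of-range value mentioned above. Choosing a
slightly smaller maxamp, for example 32767/32768, keeps every 16-bit input
within the range -1.0 to 1.0 in this sketch.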
The 'byteorder' keyword specifies the byte ordering (endian) of the data.
This only applies to > 8 bit data, and is currently only consulted for
integer data. Integer data can be big endian, meaning it conforms to SGI
MIPS / Motorola byte ordering, or it can be little endian, meaning it
conforms to Intel byte ordering. All formats supported by the sf
programs use big endian except WAVE. Any >8 bit raw file transferred
to/from a PC should be converted to/from little endian (respectively).
For UNIX and Macintosh (t.m.) files, big endian data is almost always
desired, and it is the default. Note that little endian floating point
representations are currently not supported. In the soundfiler program,
big endian is always assumed for raw data, AIFF, and AIFF-C, and little
endian is assumed for WAVE.
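The difference between the two byte orderings, and the byte swap needed when
moving 16-bit raw data between them, can be illustrated with a minimal
sketch. The helper names pack_sample16 and swap_bytes16 are hypothetical and
are not part of the sf programs.

```python
import struct


def pack_sample16(sample, byteorder="big"):
    """Encode one signed 16-bit sample in the given byte order."""
    fmt = ">h" if byteorder == "big" else "<h"
    return struct.pack(fmt, sample)


def swap_bytes16(raw):
    """Swap the byte order of a buffer of 16-bit samples."""
    if len(raw) % 2:
        raise ValueError("buffer length must be a multiple of 2 bytes")
    swapped = bytearray(len(raw))
    swapped[0::2] = raw[1::2]  # low bytes move to the even positions
    swapped[1::2] = raw[0::2]  # high bytes move to the odd positions
    return bytes(swapped)
```

The sample value 0x1234 is stored as the bytes 12 34 in big endian order and
as 34 12 in little endian order; swap_bytes16 converts a buffer from either
ordering to the other.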
The 'dataoff' keyword is used only when specifying the format of raw
data. This feature can be useful if you have a file which contains some
sound data that starts somewhere in the middle of the file. The offset
is given in bytes from the beginning of the file.
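As an illustration of what a byte offset means for raw data, the sketch
below skips the given number of bytes and decodes everything after it as
16-bit integer samples. The function name read_raw_int16 is hypothetical,
and reading to the end of the buffer is a simplifying assumption of this
sketch.

```python
import struct


def read_raw_int16(data, dataoff=0, byteorder="big"):
    """Decode signed 16-bit samples from raw bytes, starting at dataoff
    and continuing to the end of the buffer."""
    payload = data[dataoff:]
    count = len(payload) // 2  # ignore a trailing odd byte, if any
    prefix = ">" if byteorder == "big" else "<"
    return list(struct.unpack(f"{prefix}{count}h", payload[: count * 2]))


# A made-up 4-byte header followed by three big endian samples.
blob = b"HDR!" + struct.pack(">3h", 1, -2, 300)
samples = read_raw_int16(blob, dataoff=4)
```

Here the first four bytes are skipped and the remaining six bytes decode to
the three samples 1, -2, and 300.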
The 'dataoff' keyword can be used to convert or play a soundfile in a
format that the sf programs do not recognize, if the offset of the sound
data can be determined. It would then be possible to convert the file to
an aiff or other file which is more easily manipulated on Silicon
Some keywords only make sense in certain contexts:
- 'channels', 'rate', 'integer', 'float', 'mulaw' can be used anywhere.
- 'format' does not make sense when describing the format of raw
(headerless) data. Its purpose is to specify which type of header (aiff,
next, wave, etc.) to format the file with.
- 'dataoff' only makes sense when describing raw data, since the offset
of the sound data is known for soundfiles which have headers.
See the above discussion about rate conversion for an important note
about conversion to/from a nonstandard rate (standard rates are those
which appear on the Audio Control Panel).
Note that no dithering is done on conversions from integers of higher
resolution to lower resolution. This will be amended in a future release.
There should be a 'datasize' keyword to use with 'dataoff' when
converting a soundfile of an unsupported format to a playable file. This
is coming. Currently sfconvert assumes that the sound data continues to
the end of the file.
Silicon Graphics Inc.; Apple Computer, Inc. for AIFF code.
intro(3a) for more about the audio library. sfplay(1), sfinfo(1),