vxio(7) VxVM 3.5 vxio(7)
1 Jun 2002
NAME
vxio - VERITAS Volume Manager virtual disk devices
DESCRIPTION
Volume devices are the virtual disk devices in VERITAS Volume Manager
(VxVM). The volume devices support a virtual disk access method with
disk mirroring and disk striping. A volume is a logical entity
composed of one or more plexes. A read can be satisfied from any
plex, while a write is directed to all plexes. The virtual disk
devices have a wide variety of behaviors, which are programmable
through the /dev/vx/config device. For volume devices, both
block-special and character-special devices are implemented.
Each plex in the volume is a copy of the volume address space. The
plex has subdisks associated with it. These subdisks provide backup
storage for the volume address space.
It is possible to create a sparse plex, which is a plex without backup
storage for some of the volume address space. The areas of a sparse
plex that do not have a backing subdisk are called holes. A read of a
hole cannot be satisfied from the sparse plex itself. If another plex
has backup storage for the requested range, the read is satisfied from
that plex; otherwise, the read fails. A write to a hole in a sparse
plex is considered a success even though the data cannot be read back.
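The read and write rules for sparse plexes can be illustrated with a small in-memory model. This is only a sketch, not driver code: the toy_plex structure, the toy_read and toy_write functions, and the one-byte-per-sector simplification are all hypothetical. The model always picks the first plex that has storage for the sector, whereas the driver may choose any suitable plex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define VOL_SECTORS 4   /* toy volume size, in sectors (hypothetical) */

/* Toy plex: one byte stands in for one sector of data. */
struct toy_plex {
    bool backed[VOL_SECTORS];          /* false marks a hole */
    unsigned char data[VOL_SECTORS];
};

/* Read one sector from the first plex that has storage for it.
 * Returns 0 and stores the byte in *out, or -1 if every plex has a
 * hole at that sector. */
int toy_read(const struct toy_plex *plexes, size_t nplex,
             size_t sector, unsigned char *out)
{
    assert(sector < VOL_SECTORS);
    for (size_t i = 0; i < nplex; i++) {
        if (plexes[i].backed[sector]) {
            *out = plexes[i].data[sector];
            return 0;
        }
    }
    return -1;
}

/* Write one sector to every plex. A hole absorbs the write: it counts
 * as a success even though nothing is stored. The toy model has no
 * I/O errors, so this always returns 0. */
int toy_write(struct toy_plex *plexes, size_t nplex,
              size_t sector, unsigned char value)
{
    assert(sector < VOL_SECTORS);
    for (size_t i = 0; i < nplex; i++) {
        if (plexes[i].backed[sector])
            plexes[i].data[sector] = value;
    }
    return 0;
}
```

In this model, a sector that is a hole in every plex accepts a write but cannot be read afterwards, which mirrors the behavior described above.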
In addition, a plex may be designated as a logging plex. This means
that a log of blocks that are in transition will be kept, which
enables fast recovery after a system failure. This feature is known
as DRL, or dirty region logging. The log for each plex consists of a
specially designated subdisk that is not part of the normal plex
address space.
IOCTLS
The ioctl commands supported by the volume virtual disk device
interface are discussed later in this section. The format for calling
each ioctl command is:
     #include <sys/types.h>
     #include <sys/volclient.h>

     struct tag arg;
     int ioctl (int fd, int cmd, struct tag *arg);
The value of cmd is the ioctl command code and arg is usually a
pointer to a structure containing the arguments that need to be passed
to the kernel.
The return value for all these ioctls, with some exceptions, is zero
if the command was successful, and -1 if it was rejected. If the
return value is -1, then errno is set to indicate the cause of the
error.
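The return convention can be wrapped in a small helper. The following is a minimal sketch: checked_ioctl is a hypothetical name, and the command type follows the ioctl(2) prototype found on many current systems (unsigned long) rather than the int shown in the synopsis above. The command codes themselves must come from <sys/volclient.h> and are not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/*
 * Issue an ioctl and report the outcome using the 0 / -1 convention.
 * Returns the ioctl return value; *err receives errno when the call
 * returns -1 and 0 otherwise. The command code and argument are
 * passed through untouched.
 */
int checked_ioctl(int fd, unsigned long cmd, void *arg, int *err)
{
    int rc;

    assert(err != NULL);
    errno = 0;
    rc = ioctl(fd, cmd, arg);
    *err = (rc == -1) ? errno : 0;   /* capture errno immediately */
    return rc;
}
```

Capturing errno directly after the call matters because later library calls, such as those used to print a message, may overwrite it.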
- 1 - Formatted: January 24, 2005
The following ioctl commands are supported:
GET_DAEMON
This ioctl returns the pid of the process with the /dev/vx/config
device open, or 0 if it is closed. The value of arg is undefined
and should be NULL.
GET_VOLINFO
This command accepts a pointer to a volinfo structure as an
argument. It fills in the volinfo structure with the
corresponding values from the kernel. The members of a volinfo
structure are:
     long    version;               /* kernel version number */
     long    max_volprivmem;        /* max size of volprivmem area */
     major_t volbmajor;             /* volume blk dev major number */
     major_t volcmajor;             /* volume char dev major number */
     major_t plexmajor;             /* plex device major number */
     long    maxvol;                /* max # of volumes supported */
     long    maxplex;               /* max # of associated plexes */
     long    plexnum;               /* max plexes per volume */
     long    sdnum;                 /* max subdisks per plex */
     long    max_ioctl;             /* max size of ioctl data */
     long    max_specio;            /* max size of ioctl I/O op */
     long    max_io;                /* max size of I/O operation */
     long    vol_maxkiocount;       /* max # top level I/Os allowed */
     long    dflt_iodelay;          /* default I/O delay for utils */
     long    max_parallelio;        /* max # voldios allowed */
     long    voldrl_min_regionsz;   /* min DRL region size */
     long    voldrl_max_drtregs;    /* max # of DRL dirty regions */
     long    vol_is_root;           /* if set, root is volume */
     long    mvrmaxround;           /* max round-robin region size */
     long    prom_version;          /* PROM version of the system */
     long    vol_maxstablebufsize;  /* max size of copy buffer */
     size_t  voliot_iobuf_limit;    /* max total I/O trace buf spc */
     size_t  voliot_iobuf_max;      /* max size of I/O trace buffer */
     size_t  voliot_iobuf_default;  /* default I/O trace buf size */
     size_t  voliot_errbuf_default; /* default error trace buf size */
     long    voliot_max_open;       /* max # of trace channels */
     size_t  vol_checkpt_default;   /* default checkpoint size */
     long    volraid_rsrtransmax;   /* max # of transient RSRs */
PLEX_DETACH
This command is used to force a plex to be detached from a
volume. The name of the plex is passed in as an argument. The
volume is the volume device against which the ioctl is being
performed.
VOL_LOG_WRITE
This command forces a dirty region log to be flushed to disk.
This is used by the vxconfigd process to flush an initial log to
disk before starting the volume.
VOL_READ, VOL_WRITE
These commands provide a mechanism by which I/O can be issued to
volumes larger than 2 gigabytes in length. The standard UNIX read
and write system calls on 32-bit processors are limited to 2
gigabytes because they use a signed byte offset to position the
I/O. These ioctl commands take the offset in sectors instead,
which raises the limit to one sector less than 1024 gigabytes.
The required I/O is identified to the command by the use of a
vol_rdwr structure containing the following:
     ulong_t  vrw_flags;   /* flags */
     voff_t   vrw_off;     /* offset in volume (sectors) */
     size_t   vrw_size;    /* number of sectors to transfer */
     caddr_t  vrw_addr;    /* user address for transfer */
The vrw_flags field is currently unused; other fields are
explained in the comments.
VOLUME-SPECIAL IOCTLS
The ATOMIC_COPY, VERIFY_READ, and VERIFY_WRITE ioctls perform special
I/O operations against the volume. They use the vol_io structure to
initiate I/O requests and receive the status information back. The
members of the vol_io structure are:
     voff_t          vi_offset;    /* 0x00 offset on plex */
     size_t          vi_len;       /* 0x04 amount of data to read/write */
     caddr_t         vi_buf;       /* 0x08 ptr to buffer */
     size_t          vi_nsrcplex;  /* 0x0c number of source plexes */
     size_t          vi_ndestplex; /* 0x10 number of destination plexes */
     struct plx_ent *vi_plexptr;   /* 0x14 ptr to array of plex entries */
     ulong_t         vi_flag;      /* 0x18 flags associated with op */
The members of the plx_ent structure are:
     char  pe_name[NAME_SZ];  /* name of plex */
     int   pe_errno;          /* error number against plex */
The vi_offset value specifies the sector offset of the I/O within the
volume. It must be within the address range of the volume. Also, the
entire range of the I/O from vi_offset to vi_offset + vi_len must be
within the address range of the volume.
The vi_len field specifies the length of the I/O in sectors. It must
be between 0 and 120 sectors (VOL_MAXSPECIALIO).
The vi_buf field is a pointer to a buffer of vi_len sectors. The
VERIFY_WRITE ioctl writes the data stored in this buffer.
The vi_nsrcplex field is the number of source plexes available for the
operation and the vi_ndestplex field is the number of destination
plexes available for the operation. The vi_nsrcplex and vi_ndestplex
values must be between 0 and 8 (PLEX_NUM).
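The range rules for vi_offset, vi_len, vi_nsrcplex and vi_ndestplex can be checked in user space before the ioctl is issued. The following is a minimal sketch: toy_vol_io is a local stand-in for part of the vol_io structure, check_vol_io is a hypothetical helper, and the two limits are the values quoted in the text rather than values read from a header. Returning EINVAL for every violation is an assumption based on the EINVAL entry under DIAGNOSTICS.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define VOL_MAXSPECIALIO 120   /* limit quoted in the text, in sectors */
#define PLEX_NUM         8     /* limit quoted in the text */

/* Local stand-in for the fields of vol_io that are range checked. */
struct toy_vol_io {
    uint32_t vi_offset;     /* sector offset within the volume */
    size_t   vi_len;        /* length in sectors */
    void    *vi_buf;
    size_t   vi_nsrcplex;
    size_t   vi_ndestplex;
};

/* Returns 0 if the request passes the documented range rules for a
 * volume of vol_sectors sectors, or EINVAL otherwise. */
int check_vol_io(const struct toy_vol_io *io, uint64_t vol_sectors)
{
    assert(io != NULL);
    if (io->vi_len > VOL_MAXSPECIALIO)
        return EINVAL;
    if (io->vi_nsrcplex > PLEX_NUM || io->vi_ndestplex > PLEX_NUM)
        return EINVAL;
    if (io->vi_offset >= vol_sectors)           /* start must be in range */
        return EINVAL;
    if ((uint64_t)io->vi_offset + io->vi_len > vol_sectors)
        return EINVAL;                          /* end must be in range */
    return 0;
}
```

A check like this only avoids a round trip to the kernel for obviously bad requests; the kernel performs its own validation in any case.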
The vi_plexptr is a pointer to an array of plx_ent structures. The
first vi_nsrcplex entries in the array are source plexes. The pe_name
contains the name of the plex. If the name of the first source entry
is the null string, then the kernel selects all plexes available for
reading as part of the volume and fills in the pe_name fields.
After the source plexes, the next vi_ndestplex entries are the
destination plexes. If the name of the first destination entry is the
null string, then the kernel selects all plexes available for writing
as part of the volume and fills in the pe_name fields.
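The layout of the plex entry array (source entries first, destination entries next) can be sketched as follows. The toy_plx_ent structure is a local stand-in for plx_ent, the NAME_SZ value is a placeholder because the real value is not given on this page, and build_plex_array is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_SZ 32   /* placeholder; the real value comes from the header */

/* Local stand-in for the plx_ent structure. */
struct toy_plx_ent {
    char pe_name[NAME_SZ];
    int  pe_errno;
};

/* Fill ents[] with nsrc source names followed by ndest destination
 * names. Passing NULL for a name list leaves those entries as empty
 * strings, which is the form that asks the kernel to choose the
 * plexes. Returns the total number of entries written. */
size_t build_plex_array(struct toy_plx_ent *ents,
                        const char *const *src, size_t nsrc,
                        const char *const *dest, size_t ndest)
{
    size_t total = nsrc + ndest;

    assert(ents != NULL);
    memset(ents, 0, total * sizeof(*ents));
    for (size_t i = 0; i < nsrc && src != NULL; i++)
        strncpy(ents[i].pe_name, src[i], NAME_SZ - 1);
    for (size_t i = 0; i < ndest && dest != NULL; i++)
        strncpy(ents[nsrc + i].pe_name, dest[i], NAME_SZ - 1);
    return total;
}
```

Because the array is zeroed first and at most NAME_SZ - 1 bytes are copied, every name stays null terminated. The caller would then store the array address in vi_plexptr and the two counts in vi_nsrcplex and vi_ndestplex.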
After the I/O operations are performed, the plx_ent structures are
copied back to the user. If the kernel selected the plexes, the names
of the selected plexes are in the pe_name fields. The status of the
operation on each plex is stored in the pe_errno field of the
plx_ent structure for that plex.
The pe_errno field is 0 if the operation succeeded against the plex.
If pe_errno isn't 0, then the error code indicates what happened to
the plex. The possible values for pe_errno are:
EACCES The specified plex is in the disabled state, so no I/O
can be performed against it.
EFAULT A source plex is sparse and doesn't have blocks that
map the entire I/O request.
EIO A read I/O error was returned against a source plex.
ENOENT The specified plex isn't associated with the volume the
ioctl was issued against.
ENXIO An error was detected in the operation, so no I/O
operation was attempted to the plex. If one plx_ent
structure in a list contains a bad name, then no I/O is
done. All plx_ent structures with a valid name have
their pe_errno set to ENXIO to indicate no I/O was
attempted.
EROFS A write I/O error was returned against a destination
plex.
ESRCH The VERIFY_READ and VERIFY_WRITE operations compare the
data from different plexes against each other to verify
the consistency. If the comparison detects an error,
the plex that was read first is considered correct. An
ESRCH error is returned against the plex that was read
second to indicate it contains bad data.
For the ATOMIC_COPY, VERIFY_READ, and VERIFY_WRITE ioctls, a return
value of 0 means that the entire operation succeeded. A return value
of -1 indicates a fatal error, and the external variable errno
indicates the reason for the failure. A return value of 1 indicates a
failure on one or more plexes; the net result must be determined by
examining the pe_errno field of every plex entry.
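A caller might handle the three return values along the following lines. This is a sketch only: toy_plx_ent is a local stand-in for plx_ent with a placeholder NAME_SZ, and failed_plexes is a hypothetical helper. The numeric pe_errno value is printed rather than a strerror(3) string, because the codes carry the plex-specific meanings listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NAME_SZ 32   /* placeholder; the real value comes from the header */

/* Local stand-in for the plx_ent structure. */
struct toy_plx_ent {
    char pe_name[NAME_SZ];
    int  pe_errno;
};

/* Interpret the return value of a volume-special ioctl.
 * Returns -1 for a fatal error (the caller should consult errno),
 * 0 if the whole operation succeeded, or the number of plex entries
 * that report an error when the ioctl returned 1. */
int failed_plexes(int rc, const struct toy_plx_ent *ents, size_t n)
{
    int failed = 0;

    if (rc == -1)
        return -1;
    if (rc == 0)
        return 0;
    assert(ents != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ents[i].pe_errno != 0) {
            fprintf(stderr, "plex %s: pe_errno %d\n",
                    ents[i].pe_name, ents[i].pe_errno);
            failed++;
        }
    }
    return failed;
}
```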
The ioctls that do volume-special I/O are:
ATOMIC_COPY
This ioctl takes a pointer to a vol_io structure as an argument.
It reads vi_len sectors of data, at offset vi_offset, from one of
the plexes specified by the first vi_nsrcplex plex entries into a
buffer. Then it writes the contents of the buffer onto all the
plexes specified by the next vi_ndestplex plex entries at the
same offset. This entire operation is atomic with respect to the
I/O stream of the volume. The vi_buf field is unused and should
be NULL.
If the first source plex entry has a null string for the name,
the kernel selects from any plex of the volume that is enabled
for read access. The names of any selected plexes are copied
into the appropriate plex entries.
If the first destination plex entry has a null string for the
name, the kernel will write to all plexes of the volume that are
enabled for write access. The names of selected plexes are
copied into the appropriate plex entries.
When the list of source plexes has been compiled, the kernel
tries to read each plex in order. A plex can't be read if it
doesn't have backup storage covering the entire operation. Once
a plex has been successfully read, all the destination plexes are
written. The writes can succeed even if the destination plex is
sparse and doesn't have backup storage to cover the entire write.
If the ioctl returns -1, an error prevented the ATOMIC_COPY from
being carried out, and errno indicates the cause. If the return
value is 0, the entire operation succeeded. If the return value
is 1, the pe_errno field of each plex entry must be examined to
determine the errors on individual plexes. The status of the
overall operation depends on these individual errors.
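The copy sequence described above (try each source plex in order, then write every destination) can be modelled in memory. This is an illustrative sketch, not the kernel algorithm: toy_plex, covers and toy_atomic_copy are hypothetical, one byte stands in for one sector, and the model has no I/O errors or locking. Marking the destinations with ENXIO when no source can be read is an assumption; the page does not say what the driver does in that case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define VOL_SECTORS 8   /* toy volume size, in sectors (hypothetical) */

struct toy_plex {
    bool backed[VOL_SECTORS];          /* false marks a hole */
    unsigned char data[VOL_SECTORS];   /* one byte per sector */
};

/* True if the plex has storage behind every sector in [off, off+len). */
bool covers(const struct toy_plex *p, size_t off, size_t len)
{
    for (size_t s = off; s < off + len; s++)
        if (!p->backed[s])
            return false;
    return true;
}

/* Copy [off, off+len) from the first fully backed source plex to every
 * destination plex. src_err[] and dest_err[] play the role of pe_errno.
 * Returns 0 if no per-plex error was recorded, 1 otherwise. */
int toy_atomic_copy(struct toy_plex *const *src, int *src_err, size_t nsrc,
                    struct toy_plex *const *dest, int *dest_err, size_t ndest,
                    size_t off, size_t len)
{
    unsigned char buf[VOL_SECTORS] = {0};
    bool have_data = false;
    int rc = 0;

    assert(off <= VOL_SECTORS && len <= VOL_SECTORS - off);
    for (size_t i = 0; i < nsrc; i++)
        src_err[i] = 0;
    for (size_t i = 0; i < ndest; i++)
        dest_err[i] = 0;

    /* Try the sources in order until one covers the whole range. */
    for (size_t i = 0; i < nsrc && !have_data; i++) {
        if (covers(src[i], off, len)) {
            memcpy(buf, src[i]->data + off, len);
            have_data = true;
        } else {
            src_err[i] = EFAULT;   /* sparse source cannot map the request */
            rc = 1;
        }
    }
    if (!have_data) {
        for (size_t i = 0; i < ndest; i++)
            dest_err[i] = ENXIO;   /* assumption: no write was attempted */
        return 1;
    }
    /* Write every destination; holes silently absorb the data. */
    for (size_t i = 0; i < ndest; i++)
        for (size_t s = 0; s < len; s++)
            if (dest[i]->backed[off + s])
                dest[i]->data[off + s] = buf[s];
    return rc;
}
```

The model shows why a return value of 1 does not necessarily mean that the copy failed: a sparse first source can be skipped with EFAULT while a later source supplies the data.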
VERIFY_READ
This command accepts a pointer to a vol_io structure as an
argument. It reads vi_len sectors, from offset vi_offset, on the
first vi_nsrcplex plexes specified by the vi_plexptr array of
plex entries. It compares the data from each plex. If any plex
is different from the previous plexes read, the pe_errno value
for that plex is set to ESRCH. This entire operation is atomic
with respect to the I/O stream of the volume.
The vi_ndestplex field must be 0. If the vi_buf field is not
NULL, then the data read from the plexes not marked with ESRCH is
copied into the buffer specified by vi_buf.
As each plex is read, the sectors that an earlier plex has
already supplied are compared against the comparison buffer. If
all of them match, the sectors that this plex supplies for the
first time are copied into the comparison buffer. This allows
VERIFY_READ operations to work if the volume contains sparse
plexes. Regions of a plex that have no backup storage are not
compared against anything.
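The comparison scheme for sparse plexes can be sketched with an in-memory model. The names toy_plex and toy_verify_read are hypothetical, one byte stands in for one sector, and the known[] array is an illustrative way of tracking which sectors of the comparison buffer have been filled; the page does not describe how the driver tracks this.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define VOL_SECTORS 8   /* toy volume size, in sectors (hypothetical) */

struct toy_plex {
    bool backed[VOL_SECTORS];          /* false marks a hole */
    unsigned char data[VOL_SECTORS];   /* one byte per sector */
};

/* Compare [off, off+len) across the plexes in order. plex_err[] plays
 * the role of pe_errno: a plex that disagrees with data supplied by an
 * earlier plex gets ESRCH. If out is not NULL it receives the sectors
 * collected in the comparison buffer. Returns 0 if every plex agreed,
 * 1 otherwise. */
int toy_verify_read(const struct toy_plex *const *plexes, int *plex_err,
                    size_t nplex, size_t off, size_t len,
                    unsigned char *out)
{
    unsigned char cmp[VOL_SECTORS] = {0};
    bool known[VOL_SECTORS] = {false};
    int rc = 0;

    assert(off <= VOL_SECTORS && len <= VOL_SECTORS - off);
    for (size_t i = 0; i < nplex; i++) {
        const struct toy_plex *p = plexes[i];
        bool match = true;

        /* Compare only sectors that an earlier plex already supplied. */
        for (size_t s = 0; s < len; s++)
            if (p->backed[off + s] && known[s] &&
                p->data[off + s] != cmp[s])
                match = false;
        if (!match) {
            plex_err[i] = ESRCH;
            rc = 1;
            continue;
        }
        plex_err[i] = 0;
        /* The plex passed, so adopt sectors seen for the first time. */
        for (size_t s = 0; s < len; s++) {
            if (p->backed[off + s] && !known[s]) {
                cmp[s] = p->data[off + s];
                known[s] = true;
            }
        }
    }
    if (out != NULL)
        for (size_t s = 0; s < len; s++)
            if (known[s])
                out[s] = cmp[s];
    return rc;
}
```

In this model a sparse plex that is read first contributes only the sectors it has, and later plexes fill in the rest, so a hole never causes a mismatch by itself.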
VERIFY_WRITE
This command accepts a pointer to a vol_io structure as an
argument. It writes vi_len sectors to offset vi_offset on the
first vi_ndestplex plexes specified by the vi_plexptr array of
plex entries. The data to be written is stored in the buffer
pointed to by vi_buf. Then, the data is read back from each plex
that was successfully written and is compared against the data
written. If the data from any plex doesn't match the data
written, the pe_errno value for the plex is set to ESRCH. This
entire operation is atomic with respect to the I/O stream of the
volume.
The vi_nsrcplex field must be 0.
As each plex is read back, any region that doesn't have backup
storage on that plex is filled in from the write buffer. This
allows VERIFY_WRITE operations to work if the volume contains
sparse plexes, because the comparison for regions without backup
storage always succeeds.
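The write, read back and compare sequence can be modelled in the same way. In this sketch, toy_plex and toy_verify_write are hypothetical names, one byte stands in for one sector, and the drops_writes flag is an invented fault-injection switch that lets the model produce a mismatch; it does not correspond to anything in the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define VOL_SECTORS 8   /* toy volume size, in sectors (hypothetical) */

struct toy_plex {
    bool backed[VOL_SECTORS];          /* false marks a hole */
    bool drops_writes;                 /* fault injection for the model */
    unsigned char data[VOL_SECTORS];   /* one byte per sector */
};

/* Write buf to [off, off+len) on every destination plex, read it back,
 * and compare. dest_err[] plays the role of pe_errno: a plex whose
 * read-back differs from buf gets ESRCH. Returns 0 if every plex
 * verified, 1 otherwise. */
int toy_verify_write(struct toy_plex *const *dest, int *dest_err,
                     size_t ndest, size_t off, size_t len,
                     const unsigned char *buf)
{
    int rc = 0;

    assert(off <= VOL_SECTORS && len <= VOL_SECTORS - off);
    for (size_t i = 0; i < ndest; i++) {
        struct toy_plex *p = dest[i];
        unsigned char readback[VOL_SECTORS];

        /* Write phase: holes absorb the data. */
        for (size_t s = 0; s < len; s++)
            if (p->backed[off + s] && !p->drops_writes)
                p->data[off + s] = buf[s];
        /* Read-back phase: holes are filled in from the write buffer. */
        for (size_t s = 0; s < len; s++)
            readback[s] = p->backed[off + s] ? p->data[off + s] : buf[s];
        if (memcmp(readback, buf, len) != 0) {
            dest_err[i] = ESRCH;
            rc = 1;
        } else {
            dest_err[i] = 0;
        }
    }
    return rc;
}
```

Filling the holes from the write buffer is what makes the comparison succeed for a sparse plex, since those sectors trivially equal the data that was written.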
VOL_LOG_WRITE
This command takes no argument, and causes the log for a volume
to be written to disk immediately. This command is useful for
making sure that the on-disk images of the log have been written.
This command returns -1 with errno set to EINVAL if the
specified volume does not have logging enabled.
PLEX_DETACH
This command allows an enabled plex to be detached. The argument
is the name of the plex to detach.
DIAGNOSTICS
The following errors are returned by volume virtual disk device
interfaces:
EAGAIN A needed kernel resource couldn't be obtained.
EBADF An attempt was made to write to a volume that wasn't
opened for writing, or to read from a volume that
wasn't opened for reading.
EFAULT A pointer passed to the kernel was invalid, causing a
bad memory reference.
EINVAL Invalid data was passed to the kernel. Some field in a
vol_io structure failed a sanity check.
EIO A physical I/O error occurred during an operation. If
this error is returned, the driver tried an I/O and it
failed.
EMFILE The kernel was asked to supply a list of destination
plexes for an ATOMIC_COPY or VERIFY_WRITE ioctl, and
more than vi_ndestplex enabled plexes were available
for writing.
ENFILE The kernel was asked to supply a list of source plexes
for an ATOMIC_COPY or VERIFY_READ ioctl, and more than
vi_nsrcplex enabled plexes were available for reading.
ENOENT An object named in a GET_VOL_STATS ioctl was not
associated with the volume.
ENOENT The kernel was asked to supply a list of destination
plexes for an ATOMIC_COPY or VERIFY_WRITE ioctl, and no
enabled plexes were available for writing.
ENXIO A validation error occurred during an I/O operation.
If this error is returned, the driver attempted no I/O.
ESRCH The kernel was asked to supply a list of source plexes
for an ATOMIC_COPY or VERIFY_READ ioctl, and no enabled
plexes were available for reading.
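The four errors that relate to kernel plex selection follow one pattern, which the sketch below makes explicit. The function names are hypothetical and the code only models the outcomes listed above: "available" is the number of enabled plexes usable in the given mode, and "slots" is the caller's vi_nsrcplex or vi_ndestplex value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Outcome of asking the kernel to choose the source plexes.
 * Returns 0 if the selection fits, or the errno value documented
 * for the failure. */
int source_selection_error(size_t available, size_t slots)
{
    if (available == 0)
        return ESRCH;    /* no enabled plex available for reading */
    if (available > slots)
        return ENFILE;   /* more readable plexes than vi_nsrcplex */
    return 0;
}

/* Outcome of asking the kernel to choose the destination plexes. */
int dest_selection_error(size_t available, size_t slots)
{
    if (available == 0)
        return ENOENT;   /* no enabled plex available for writing */
    if (available > slots)
        return EMFILE;   /* more writable plexes than vi_ndestplex */
    return 0;
}
```

Seen this way, a caller that lets the kernel choose the plexes can avoid ENFILE and EMFILE by sizing the plex entry array for the largest number of plexes a volume may have.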
FILES
/dev/vx/dsk/*... Volume block device files.
/dev/vx/rdsk/*... Volume character (raw) device files.
SEE ALSO
ioctl(2), vxconfig(7), vxiod(7), vxtrace(7)