NAME

     vnode - an overview of vnodes

DESCRIPTION

     A vnode is an object in kernel memory that speaks the UNIX file interface
     (open, read, write, close, readdir, etc.).  Vnodes can represent files,
     directories, FIFOs, domain sockets, block devices, and character devices.

     Each vnode has a set of methods whose names start with the string 'VOP_'.
     These methods include VOP_OPEN, VOP_READ, VOP_WRITE, VOP_RENAME, and
     VOP_MKDIR.  Many of these methods correspond closely to the equivalent
     file-related system calls (open, read, write, rename, etc.).  Each file
     system (FFS, NFS, etc.) provides implementations for these methods.

     The Virtual File System (VFS) library maintains a pool of vnodes.  File
     systems cannot allocate their own vnodes; they must use the functions
     provided by the VFS to create and manage vnodes.

   Vnode life cycle
     When a client of the VFS requests a new vnode, the vnode allocation code
     can reuse an old vnode object that is no longer in use.  Whether a vnode
     is in use is tracked by the vnode reference count (v_usecount).  By
     convention, each open file handle holds a reference, as do VM objects
     backed by files.  A vnode with a reference count of 1 or more will not be
     deallocated or reused to point to a different file.  So, if you want to
     ensure that your vnode doesn't become a different file under you, you had
     better be sure you have a reference to it.  A vnode that points to a
     valid file and has a reference count of 1 or more is called "active".

     When a vnode's reference count drops to zero, it becomes "inactive", that
     is, a candidate for reuse.  An "inactive" vnode still refers to a valid
     file and one can try to reactivate it using vget(9) (this is used a lot
     by caches).

     Before the VFS can reuse an inactive vnode to refer to another file, it
     must clean all information pertaining to the old file.  A cleaned out
     vnode is called a "reclaimed" vnode.

     To support forcible unmounts and the revoke(2) system call, the VFS may
     "reclaim" a vnode with a positive reference count.  The "reclaimed" vnode
     is given to the dead file system, which returns errors for most
     operations.  The reclaimed vnode will not be reused for another file
     until its reference count hits zero.
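
     The states described above can be summarized with a small user-space
     model.  This is only a sketch: struct model_vnode, its has_file field,
     and the model_* functions are hypothetical names invented for
     illustration and are not part of the kernel.  Only the notions of
     "active", "inactive", and "reclaimed" come from the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; this is NOT the kernel's struct vnode. */
enum model_state { MODEL_ACTIVE, MODEL_INACTIVE, MODEL_RECLAIMED };

struct model_vnode {
    int  usecount;  /* stands in for v_usecount */
    bool has_file;  /* false once the vnode has been cleaned out */
};

/* Classify a vnode using the terms from the text. */
enum model_state
model_classify(const struct model_vnode *vp)
{
    if (!vp->has_file)
        return MODEL_RECLAIMED;
    return vp->usecount > 0 ? MODEL_ACTIVE : MODEL_INACTIVE;
}

/*
 * Reclaim the vnode.  As with revoke(2) or a forced unmount, this is
 * permitted even while references are still held.
 */
void
model_reclaim(struct model_vnode *vp)
{
    vp->has_file = false;
}

/*
 * A vnode can be handed out for another file only after it has been
 * cleaned out and its reference count has reached zero.
 */
bool
model_ready_for_reuse(const struct model_vnode *vp)
{
    return !vp->has_file && vp->usecount == 0;
}
```

     In this model a vnode that is reclaimed while its count is still 1
     reports MODEL_RECLAIMED, but model_ready_for_reuse() stays false until
     the last reference is dropped.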

   Vnode pool
     The getnewvnode(9) function allocates a vnode from the pool, possibly
     reusing an "inactive" vnode, and returns it to the caller.  The vnode
     returned has a reference count (v_usecount) of 1.

     The vref(9) call increments the reference count on the vnode.  It may
     only be used on a vnode with a reference count of 1 or greater.  The
     vrele(9) and vput(9) calls decrement the reference count.  In addition,
     the vput(9) call also releases the vnode lock.

     The vget(9) call, when used on an inactive vnode, will make the vnode
     "active" by bumping the reference count to one.  When called on an active
     vnode, vget increases the reference count by one.  However, if the vnode
     is being reclaimed concurrently, then vget will fail and return an error.
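
     The counting rules above can be sketched in user space as follows.  The
     names struct pool_vnode, pool_ref, pool_rele, and pool_get are
     hypothetical stand-ins for the behaviour described for vref(9),
     vrele(9), and vget(9); they do not reproduce the kernel signatures, and
     the error value returned on failure is an arbitrary choice made for the
     example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space model of the counting rules; not kernel code. */
struct pool_vnode {
    int  usecount;    /* stands in for v_usecount */
    bool reclaiming;  /* true while reclamation is in progress */
};

/* Models vref: only legal on a vnode that is already referenced. */
void
pool_ref(struct pool_vnode *vp)
{
    assert(vp->usecount >= 1);
    vp->usecount++;
}

/* Models vrele: drop one reference. */
void
pool_rele(struct pool_vnode *vp)
{
    assert(vp->usecount >= 1);
    vp->usecount--;
}

/*
 * Models vget: reactivate an inactive vnode or add a reference to an
 * active one, unless the vnode is being reclaimed.
 */
int
pool_get(struct pool_vnode *vp)
{
    if (vp->reclaiming)
        return ENOENT;  /* error value chosen only for illustration */
    vp->usecount++;
    return 0;
}
```

     A caller that holds only a pointer to an inactive vnode would use
     pool_get() and check the result; pool_ref() is reserved for callers that
     already hold a reference.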

     The vgone(9) and vgonel(9) functions orchestrate the reclamation of a
     vnode.  They can be called on both active and inactive vnodes.

     When transitioning a vnode to the "reclaimed" state, the VFS will call
     the VOP_RECLAIM(9) method.  File systems use this method to free any
     file system specific data they attached to the vnode.

   Vnode locks
     The vnode actually has three different types of lock: the vnode lock, the
     vnode interlock, and the vnode reclamation lock (VXLOCK).

   The vnode lock
     The vnode lock and its consistent use accomplish the following:

     o   It keeps a locked vnode from changing across certain pairs of VOP_
         calls, thus preserving cached data.  For example, it keeps the
         directory from changing between a VOP_LOOKUP call and a VOP_CREATE
         call.  The VOP_LOOKUP call makes sure the name doesn't already exist
         in the directory and finds free room in the directory for the new
         entry.  The VOP_CREATE can then go ahead and create the file without
         checking if it already exists or looking for free space.

     o   Some file systems rely on it to ensure that only one "thread" at a
         time is calling VOP_ vnode operations on a given file or directory.
         Otherwise, the file system's behavior is undefined.

     o   On rare occasions, code will hold the vnode lock so that a series of
         VOP_ operations occurs as an atomic unit.  (Of course, this doesn't
         work with network file systems like NFSv2 that don't have any notion
         of bundling a bunch of operations into an atomic unit.)

     o   While the vnode lock is held, the vnode will not be reclaimed.
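
     The first point in the list above (a lookup whose result is trusted by a
     later create) can be illustrated with a toy directory.  Everything here
     is hypothetical: struct toy_dir and the toy_* functions are invented for
     the example, the "lock" is a plain flag checked with assert(), and no
     real VOP_ interface is used.  The sketch only shows why the cached
     lookup result is valid for as long as the lock is held and no longer.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_SLOTS    4
#define TOY_NAME_LEN 16

/* Hypothetical toy directory; not a real file system structure. */
struct toy_dir {
    bool locked;                         /* stands in for the vnode lock */
    char names[TOY_SLOTS][TOY_NAME_LEN]; /* empty string means free slot */
    int  free_slot;                      /* cached by lookup, -1 if stale */
};

void
toy_lock(struct toy_dir *d)
{
    assert(!d->locked);
    d->locked = true;
}

/* Dropping the lock invalidates whatever the last lookup cached. */
void
toy_unlock(struct toy_dir *d)
{
    assert(d->locked);
    d->locked = false;
    d->free_slot = -1;
}

/* Returns true if name exists; otherwise caches the first free slot. */
bool
toy_lookup(struct toy_dir *d, const char *name)
{
    assert(d->locked && name[0] != '\0');
    d->free_slot = -1;
    for (int i = 0; i < TOY_SLOTS; i++) {
        if (strcmp(d->names[i], name) == 0) {
            d->free_slot = -1;  /* name exists: nothing to create */
            return true;
        }
        if (d->names[i][0] == '\0' && d->free_slot < 0)
            d->free_slot = i;
    }
    return false;
}

/* Trusts the cached slot, so it must run under the same lock hold. */
int
toy_create(struct toy_dir *d, const char *name)
{
    assert(d->locked);
    if (d->free_slot < 0)
        return -1;  /* no valid lookup result, or directory full */
    strncpy(d->names[d->free_slot], name, TOY_NAME_LEN - 1);
    d->names[d->free_slot][TOY_NAME_LEN - 1] = '\0';
    d->free_slot = -1;
    return 0;
}
```

     Calling toy_create() after the lock has been dropped and retaken fails
     in this model, because the slot found by the earlier lookup can no
     longer be trusted.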

     There is a discipline to using the vnode lock.  Some VOP_ operations
     require that the vnode lock is held before being called.  A description
     of this rather arcane locking discipline is in

     The vnode lock is acquired by calling vn_lock(9) and released by calling

     A process is allowed to sleep while holding the vnode lock.

     The implementation of the vnode lock is the responsibility of the
     individual file systems.  Not all file systems implement it.

     To prevent deadlocks, when acquiring locks on multiple vnodes, the lock
     on the parent directory must be acquired before the lock on the child
     directory.
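
     The ordering rule can be sketched as a helper that always locks the
     containing directory first.  This is a hypothetical user-space model:
     struct ord_vnode, its parent pointer, and the ord_* functions are
     invented for illustration, and extending "parent" to "any ancestor" is
     an assumption of the sketch rather than something the text states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: each node knows its parent directory. */
struct ord_vnode {
    struct ord_vnode *parent;   /* NULL for the root */
    bool              locked;
    int               lock_seq; /* 1 = locked first, 2 = locked second */
};

/* True if dir is a proper ancestor of vp. */
bool
ord_is_ancestor(const struct ord_vnode *dir, const struct ord_vnode *vp)
{
    for (vp = vp->parent; vp != NULL; vp = vp->parent)
        if (vp == dir)
            return true;
    return false;
}

/*
 * Lock both vnodes, ancestor first, regardless of argument order.
 * Returns -1 if neither contains the other, a case the rule in the
 * text does not cover.
 */
int
ord_lock_pair(struct ord_vnode *a, struct ord_vnode *b)
{
    struct ord_vnode *first, *second;

    if (ord_is_ancestor(a, b)) {
        first = a;
        second = b;
    } else if (ord_is_ancestor(b, a)) {
        first = b;
        second = a;
    } else {
        return -1;
    }
    assert(!first->locked && !second->locked);
    first->locked = true;
    first->lock_seq = 1;
    second->locked = true;
    second->lock_seq = 2;
    return 0;
}
```

     Because every caller takes the two locks in the same order, two threads
     locking the same pair cannot each hold one lock while waiting for the
     other.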

   Vnode interlock
     The vnode interlock (vp->v_interlock) is a spinlock.  It is useful on
     multi-processor systems for acquiring a quick exclusive lock on the
     contents of the vnode.  It MUST NOT be held while sleeping.  (What fields
     does it cover?  What about splbio/interrupt issues?)

     Operations on this lock are a no-op on uniprocessor systems.

   Other Vnode synchronization
     The vnode reclamation lock (VXLOCK) is used to prevent multiple processes
     from entering the vnode reclamation code.  It is also used as a flag to
     indicate that reclamation is in progress.  The VXWANT flag is set by
     processes that wish to be woken up when reclamation is finished.
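
     The interplay of the two flags can be modelled with plain bit
     operations.  This is a single-threaded sketch under stated assumptions:
     RX_LOCK and RX_WANT are hypothetical stand-ins for VXLOCK and VXWANT
     (the values are arbitrary), struct rx_vnode and the rx_* functions are
     invented for the example, and the actual sleeping and wakeup are reduced
     to return values.

```c
#include <assert.h>
#include <stdbool.h>

#define RX_LOCK 0x1u  /* stands in for VXLOCK; value is arbitrary */
#define RX_WANT 0x2u  /* stands in for VXWANT; value is arbitrary */

/* Hypothetical model holding only the two flags of interest. */
struct rx_vnode {
    unsigned flags;
};

/*
 * Try to enter the reclamation code.  Returns true if the caller now
 * owns reclamation.  Otherwise records that someone is waiting and
 * returns false; a real caller would sleep at this point.
 */
bool
rx_enter(struct rx_vnode *vp)
{
    if (vp->flags & RX_LOCK) {
        vp->flags |= RX_WANT;
        return false;
    }
    vp->flags |= RX_LOCK;
    return true;
}

/*
 * Finish reclamation.  Returns true if a waiter registered itself and
 * therefore needs to be woken up.
 */
bool
rx_exit(struct rx_vnode *vp)
{
    bool wake = (vp->flags & RX_WANT) != 0;

    assert(vp->flags & RX_LOCK);
    vp->flags &= ~(RX_LOCK | RX_WANT);
    return wake;
}
```

     In this model a second caller of rx_enter() is turned away and leaves
     RX_WANT set, so the owner learns from rx_exit() that a wakeup is due.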

     The vwaitforio(9) call is used to wait for all outstanding write I/Os
     associated with a vnode to complete.

   Version number/capability
     The vnode capability, v_id, is a 32-bit version number on the vnode.
     Every time a vnode is reassigned to a new file, the vnode capability is
     changed.  This is used by code that wishes to keep pointers to vnodes but
     doesn't want to hold a reference (e.g., caches).  The code keeps both a
     vnode * and a copy of the capability.  The code can later compare the
     vnode's capability to its copy and see if the vnode still points to the
     same file.

     Note: for this to work, memory assigned to hold a struct vnode can only
     be used for another purpose when all pointers to it have disappeared.
     Since the vnode pool has no way of knowing when all pointers have
     disappeared, it never frees memory it has allocated for vnodes.
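
     The capability check used by such caches can be sketched as follows.
     This is a user-space model: struct cap_vnode, struct cap_entry, and the
     cap_* functions are hypothetical names, and bumping the number by one on
     reassignment is an assumption made for the example.  Only the idea of
     comparing a saved copy of v_id against the current value comes from the
     text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; only the v_id idea comes from the text. */
struct cap_vnode {
    uint32_t id;    /* stands in for v_id */
    int      file;  /* toy value: which file the vnode currently names */
};

/* A cache entry keeps a pointer and a capability copy, but no reference. */
struct cap_entry {
    struct cap_vnode *vp;
    uint32_t          id;
};

/* Reassigning a vnode to a new file changes its capability. */
void
cap_reassign(struct cap_vnode *vp, int new_file)
{
    vp->file = new_file;
    vp->id++;
}

/* Record the vnode and its current capability in a cache entry. */
void
cap_remember(struct cap_entry *e, struct cap_vnode *vp)
{
    e->vp = vp;
    e->id = vp->id;
}

/*
 * Returns the vnode if it still names the same file, otherwise NULL.
 * Reading e->vp->id is safe only because vnode memory is never freed.
 */
struct cap_vnode *
cap_lookup(const struct cap_entry *e)
{
    return e->vp->id == e->id ? e->vp : NULL;
}
```

     A cache built this way treats a NULL result as a miss and falls back to
     a full lookup.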

   Vnode fields
     Most of the fields of the vnode structure should be treated as opaque and
     only manipulated through the proper APIs.  This section describes the
     fields that are manipulated directly.

     The v_flag attribute contains miscellaneous flags related to various
     functions.  They are summarized in table ...

     The v_tag attribute indicates what file system the vnode belongs to.
     Very little code actually uses this attribute and its use is
     Programmers should seriously consider using more object-oriented
     approaches (e.g. function tables).  There is no safe way of defining new
     v_tags for loadable file systems.  The v_tag attribute is

     The v_type attribute indicates what type of file (e.g. directory,
     regular, FIFO) this vnode is.  This is used by the generic code for
     various checks.  For example, the read(2) system call returns an error
     when a read is attempted on a directory.

     The v_data attribute allows a file system to attach a piece of file
     system specific memory to the vnode.  This contains information about the
     file that is specific to the file system.

     The v_numoutput attribute indicates the number of pending synchronous and
     asynchronous writes on the vnode.  It does not track the number of dirty
     buffers attached to the vnode.  The attribute is used by code like fsync
     to wait for all writes to complete before returning to the user.  This
     attribute must be manipulated at splbio().

     The v_writecount attribute tracks the number of write calls pending on
     the vnode.

   RULES
     The vast majority of vnode functions may not be called from interrupt
     context.  The exceptions are bgetvp and brelvp.  The following fields of
     the vnode are manipulated at interrupt level: v_numoutput, v_holdcnt,
     v_dirtyblkhd, v_cleanblkhd, v_bioflag, v_freelist, and v_synclist.  Any
     access to these fields should be protected by splbio.

HISTORY

     This document first appeared in OpenBSD 2.9.

OpenBSD 3.6                     February 22, 2001