array_services(5)					     array_services(5)

NAME

     array_services - overview of array	services

DESCRIPTION

     Along with	the power and flexibility of clustered systems comes some
     additional	complexity in the area of administering	and managing the array
     as	a whole.  IRIX provides	several	services to help ease this situation.
     Some of these services revolve around the notion of an array session,
     which is a	set of processes, perhaps running on different nodes in	an
     array, that are conceptually related as a single "job".  Additional
     services are provided by the array	services daemon, which knows about the
     configuration of an array and is therefore	able to	provide	functions for
     describing	and administering it.

   ARRAY SESSIONS
     A principal use of	an array system	is to run jobs that are	large enough
     to	span two or more machines.  Unfortunately, the mechanisms that are
     typically used to manage multiple related processes (for example: process
     groups, terminal sessions)	are limited in scope to	a single machine.  As
     a result, mundane tasks such as killing a job or accounting for all of
     its resource usage	can become very	difficult when the job runs across
     several machines.	Some means of correlating related processes on
     different machines	is required.  IRIX provides this function with the
     notion of an "array session".

     In	formal terms, an array session is a set	of processes all related to
     each other	by a single unique identifier, the array session handle	(ASH).
     A child process ordinarily	inherits the ASH of its	parent when it is
     created, thus becoming a member of	its parent's array session.  However,
     it	is possible for	a process to leave its parent's	array session and
     start a new one.  This would be done by programs such as login(1) or
     rshd(1M) so that logging in to the	system will effectively	start a	new
     array session.  This is also done by programs like	cron(1M) and su(1M) so
     that work done on behalf of another user will be done in its own array
     session.  When the	last process with a given ASH exits, a session
     accounting	record containing accumulated statistics for all of the
     processes that ran	in the array session is	written	and the	array session
     ceases to exist.
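
     The lifecycle described above can be sketched with a small model.  This
     is an illustration only, not IRIX code: the Machine class, its method
     names, and the CPU-seconds figure used as the "accumulated statistics"
     are all assumptions made for the example.

```python
from itertools import count


class Machine:
    """Toy model of array-session bookkeeping on one machine (hypothetical)."""

    def __init__(self):
        self._pids = count(100)       # fake process IDs
        self._ashes = count(1)        # increasing local ASH values
        self.ash_of = {}              # live pid -> ASH
        self.cpu_used = {}            # ASH -> CPU seconds accumulated so far
        self.accounting_log = []      # records written when a session ends

    def start_session(self):
        """Like login(1) or cron(1M): a process that begins a new session."""
        pid, ash = next(self._pids), next(self._ashes)
        self.ash_of[pid] = ash
        self.cpu_used[ash] = 0.0
        return pid

    def fork(self, parent_pid):
        """A child ordinarily inherits its parent's ASH."""
        pid = next(self._pids)
        self.ash_of[pid] = self.ash_of[parent_pid]
        return pid

    def exit(self, pid, cpu_seconds):
        """Charge usage to the session; write a record once it is empty."""
        ash = self.ash_of.pop(pid)
        self.cpu_used[ash] += cpu_seconds
        if ash not in self.ash_of.values():
            self.accounting_log.append((ash, self.cpu_used.pop(ash)))


m = Machine()
shell = m.start_session()     # session 1
worker = m.fork(shell)        # inherits session 1
batch = m.start_session()     # session 2, for example a cron job
m.exit(worker, 2.5)           # session 1 still has a live member
m.exit(shell, 0.5)            # last member gone: the record is written
print(m.accounting_log)       # [(1, 3.0)]
```

     Session 2 is still alive at the end of the run, so no record has been
     written for it yet.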

     The array session handle itself is	a 64-bit value.	 By default, a unique,
     increasing	value (similar to a process ID)	is assigned to each new	array
     session as	its handle.  This type of ASH is referred to as	a local	ASH:
     although it is guaranteed to be unique on the local machine, it may also
     be	in use by a different session on another machine in the	same array.
     Because of this, a local ASH is not appropriate for identifying
     multi-machine jobs.  However, there is a second type of ASH known as a
     global ASH.  These are assigned by the array services daemon (see below)
     and are supposed to be unique across the entire array.  By arranging for
     the same global ASH to be associated with each process in a job, it is
     possible to treat the set of processes as a single entity, even though
     some of the processes may be running on different machines.
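
     To see why local ASHs can collide while global ASHs do not, consider the
     following sketch.  The bit layout used here (a marker bit plus a machine
     ID in the upper half of the 64-bit value) is purely an assumption for
     illustration; it is not the real IRIX encoding, and the Node class is
     hypothetical.

```python
from itertools import count

GLOBAL_FLAG = 1 << 63          # assumed marker bit, not the real IRIX layout


class Node:
    """One machine: hands out local ASHs, and global ASHs via a toy daemon."""

    def __init__(self, machine_id):
        self.machine_id = machine_id      # assumed unique within the array
        self._local = count(1)            # per-machine counter, like a PID
        self._global = count(1)

    def local_ash(self):
        return next(self._local)

    def global_ash(self):
        # Mixing in the machine ID keeps values from different nodes distinct.
        return GLOBAL_FLAG | (self.machine_id << 32) | next(self._global)


def is_global(ash):
    """Distinguish the two kinds of handle in this toy encoding."""
    return bool(ash & GLOBAL_FLAG)


a, b = Node(1), Node(2)
local_a, local_b = a.local_ash(), b.local_ash()
global_a, global_b = a.global_ash(), b.global_ash()

print(local_a == local_b)      # True: both machines started counting at 1
print(global_a == global_b)    # False: unique across the array
print(hex(global_a))           # 0x8000000100000001
```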

     The next trick is "arranging for the same global ASH to be	associated
     with each process in a job".  This	involves several steps.	 First,	each
     machine that is to	run part of the	job must start a new array session to
     contain the related processes.  By	default, this new array	session	will
     only have a local ASH, so it must "upgrade" its handle to a global	ASH.
     If	this is	the first machine to run part of the job, it would need	to
     obtain a new global ASH from the array services daemon (this is done with
     a single library call, asallocash(3X)).  Additional machines that are
     called into service for the job would need	to get a copy of the first
     machine's global ASH; presumably this information would be	passed along
     at	the same time as the rest of the information concerning	the new	job.
     Once an appropriate global	ASH has	been settled upon, it can then be
     assigned to the new array session,	replacing the original local ASH.  The
     process on	each machine that started the new array	session	is now free to
     fork off any number of children to	do the required	work.  These children
     will all have the same ASH	and can	therefore be correlated	with each
     other for administrative tasks such as job	control	or accounting.
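
     The steps above can be sketched as a small simulation.  Everything in it
     is a stand-in: allocate_global_ash() plays the role of the request to
     the array services daemon, set_ash() plays the role of assigning the
     handle to the session, and the machine names are made up.

```python
from itertools import count

_global_ashes = count(0x8000000000000001)   # stand-in for the daemon's allocator


def allocate_global_ash():
    """Stand-in for asking the array services daemon for a new global ASH."""
    return next(_global_ashes)


class Machine:
    def __init__(self, name):
        self.name = name
        self._local = count(1)
        self._pids = count(100)
        self.ash_of = {}                    # pid -> ASH

    def start_session(self):
        """Step 1: a new session, which begins with only a local ASH."""
        pid = next(self._pids)
        self.ash_of[pid] = next(self._local)
        return pid

    def set_ash(self, pid, ash):
        """Step 3: replace the session's local ASH with the agreed global ASH."""
        self.ash_of[pid] = ash

    def fork(self, parent):
        pid = next(self._pids)
        self.ash_of[pid] = self.ash_of[parent]   # children inherit the ASH
        return pid


def launch_job(machines, workers_per_machine):
    job_ash = None
    for m in machines:
        leader = m.start_session()
        if job_ash is None:
            job_ash = allocate_global_ash()      # step 2, first machine only
        m.set_ash(leader, job_ash)               # later machines reuse the copy
        for _ in range(workers_per_machine):
            m.fork(leader)
    return job_ash


def members(machines, ash):
    """Correlate the job's processes across machines by their shared ASH."""
    return [(m.name, pid) for m in machines
            for pid, a in sorted(m.ash_of.items()) if a == ash]


array = [Machine("alpha"), Machine("beta")]
job = launch_job(array, workers_per_machine=2)
print(members(array, job))
```

     Each machine contributes one session leader and two children, and all
     six processes are found with a single lookup on the shared handle.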

   ARRAY SERVICES DAEMON
     Although being able to correlate related processes	on different machines
     in	an array is necessary for the stated goal of administering an array in
     a reasonable way, it is not sufficient: something still needs to find all
     of	those related processes	and act	upon them.  That is the	job of the
     array services daemon.

     Each machine in an	array should have an array services daemon running on
     it.  The array services daemon (arrayd) performs several different	tasks:

     - It allocates global array session handles

     - It knows	the current array configuration	and can	provide	that
       information to other commands and programs

     - It can determine	which processes	belong to a particular array session
       and provide that	information to other commands and programs

     - It can forward commands to all of the machines in an array

   Global Array	Session	Handles
     As	mentioned earlier, a global array session handle is important for
     keeping track of jobs that	run on several machines	in an array.  Because
     the array services	daemon knows the configuration of an array, it is
     better suited to providing	a unique global	ASH than the IRIX kernel,
     which necessarily knows only about	the local machine.

     When a program needs to allocate a	global ASH it invokes a	single library
     call, specifying (optionally) which array the ASH is to be	allocated for.
     The library call, which is	part of	libarray (see below), takes care of
     the pragmatic issues of contacting	and communicating with the local array
     services daemon.  The resulting global ASH	can then be passed to the
     setash(2) system call.  Note that while anybody can allocate a global
     ASH, only a process with root privileges can actually change its ASH
     using setash.

     The global	ASH itself is an "opaque" value: it does not necessarily have
     any specific information embedded in it, other than to distinguish	it
     from a local ASH (a library function is provided to make this
     distinction).  Nevertheless, the identity of the specific machine that
     creates a global ASH and the array	for which it is	intended may play some
     role in the generation of the ASH value itself.  System administrators
     may specify particular values to be used for this purpose if desired.

   Array Configuration Database
     Each array	services daemon	has knowledge about one	or more	arrays and the
     machines that make	up each	of them.  This information can be provided to
     other programs and	commands with straightforward library calls in
     libarray.  Ideally, this should make it unnecessary for other
     array-oriented programs to maintain their own separate array
     configuration databases.

     An	array services daemon obtains its configuration	information from a
     configuration file	located	in its local filespace.	 Each daemon has its
     own configuration file which must be synchronized by the system
     administrator with	configuration files on other machines in the array(s).

   Array Session Information
     To	take advantage of array	sessions, it is	necessary to be	able to
     enumerate the processes that are contained	in a given array session.  For
     certain applications (monitor programs for	example) it may	also be	useful
     to	enumerate ALL of the known array sessions.  The	array services daemon
     can obtain	both types of information and provide it to other programs via
     libarray functions.

   Command Forwarding
     Command forwarding	pulls all of the other array services together:	it
     allows a user on one machine to issue a single command and	have it
     executed on all of	the machines in	an array, perhaps only affecting a
     particular	array session.	With the appropriate setup, this could be used
     for such tasks as killing a runaway job or	shutting down an entire	array.

     Users run a simple client program (array(1)) to specify the command they
     want to execute, any arguments it may require, and the array they want to
     execute it on.  Such an array command might look like this:

          array -a DevArray killash 13543423

     This example says "execute	the command 'killash 13543423' on the machines
     in	the array 'DevArray'".	The command "killash" is not necessarily an
     actual program on any machine in the array; instead it refers to an entry
     in	each machine's array configuration file.  The entry itself specifies
     which program to execute, which arguments should be passed	to it, which
     user/group/project	the command should be executed under, etc.  This
     allows each machine in an array to	handle a particular command
     differently, or not handle	it at all.

     The "array" program itself is fairly basic: it simply passes the user's
     command to the local array services daemon.  The local array services
     daemon forwards the command to each machine in the specified array,
     gathers the results, and passes them back to the "array" program, which
     presents them to the user.
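
     The way a single command name resolves differently on each machine can
     be modeled as follows.  The dictionaries stand in for each machine's
     array configuration file; the machine names, program paths and user
     names are invented for the example, and the real configuration file
     syntax is documented in arrayd.conf(4), not here.

```python
# Hypothetical per-machine command tables, standing in for each machine's
# array configuration file. Paths and user names are made up.
DEV_ARRAY = {
    "alpha": {"killash": {"program": "/usr/local/bin/kill_session",
                          "user": "root"}},
    "beta":  {"killash": {"program": "/opt/tools/endjob",
                          "user": "operator"}},
    "gamma": {},                       # this machine does not handle killash
}


def resolve(command_table, command, args):
    """Return what one machine would run for an array command, or None."""
    entry = command_table.get(command)
    if entry is None:
        return None
    return {"user": entry["user"], "argv": [entry["program"], *args]}


def forward(array, command, args):
    """Model the local daemon: send the command everywhere, gather results."""
    return {machine: resolve(table, command, args)
            for machine, table in array.items()}


results = forward(DEV_ARRAY, "killash", ["13543423"])
for machine, action in results.items():
    print(machine, action)
```

     In this model the same "killash" request becomes a different program
     and user on "alpha" and "beta", and is not handled at all on "gamma".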


   ARRAY SERVICES LIBRARY
     In general, users should never have any direct interaction with the array
     services daemon.  Instead, all interaction with the array services daemon
     is done through the array services library, libarray.  libarray provides
     functions for dealing with global ASH's, describing the current array
     configuration, and executing array commands.

     There are a number	of libarray functions, all of which are	documented in
     chapter 3X	man pages.  Some of the	libarray functions include:

	 ASH Functions
	     asallocash	      -	Allocates a global ASH
	     asashisglobal    -	Indicates whether an ASH is global or local
	     aslistashs_array -	Returns	all global ASH's in specified array

	 Configuration Functions
	     aslistarrays     -	Returns	info on	all known arrays
	     aslistmachines   -	Returns	info on	all machines in	specified array

	 Command Forwarding
	     ascommand	      -	Execute	an array command

SEE ALSO

     array(1), arrayd(1M), newsess(1), asallocash(3X), asashisglobal(3X),
     ascommand(3X), aslistarrays(3X), aslistashs_array(3X),
     aslistashs_server(3X), aslistmachines(3X),	arrayd.conf(4),

