.fp 5 CW
.de L \" literal font
.ft 5
.if !\\$1 \&\\$1 \\$2 \\$3 \\$4 \\$5 \\$6 \f1
..
.de LR
.}S 5 1 \& "\\$1" "\\$2" "\\$3" "\\$4" "\\$5" "\\$6"
..
.de RL
.}S 1 5 \& "\\$1" "\\$2" "\\$3" "\\$4" "\\$5" "\\$6"
..
.de EX \" start example
.ta 1i 2i 3i 4i 5i 6i
.PP
.RS
.PD 0
.ft 5
.nf
..
.de EE \" end example
.fi
.ft
.PD
.RE
.PP
..
.TH COSHELL 1
.SH NAME \" @(#)coshell.1 (gsf@research.att.com) 10/17/93
coshell \- network shell coprocess server
.SH SYNOPSIS
coshell \+
[
.IR info " ..."
]
.br
coshell \-
.br
coshell
.RI \- op
[
.IR arg " ..."
]
.SH DESCRIPTION
.I coshell
is a local network shell coprocess server for programs using
.IR coshell (3).
There is one
.I coshell
server per user.
This server runs as a daemon on the user's home host,
and only processes running on the home host have access to the server.
The server controls a background
.IR ksh (1)
shell process, initiated by
.IR rsh (1),
on each of the connected hosts.
The environment of the local host shell is inherited from the server
whereas the environment of remote shells is initialized by
.B .profile
and
.BR $ENV .
The shells run with the
.I ksh
.B bgnice
and
.B monitor
options on.
.PP
Job requests are accepted from user processes on the local host
and are executed on the connected hosts.
.BR stdout ,
.BR stderr ,
.BR FPATH ,
.BR NPROC
(see ENVIRONMENT),
.BR PWD ,
.BR PATH ,
.BR VPATH ,
.BR vpath ,
.B umask
and the environment variables listed in
.B COEXPORT
(see ENVIRONMENT)
are set to match requesting user values.
.B stdin
is set to
.LR /dev/null ;
.I coshell
does not directly support interactive jobs.
Job scheduling is based on load and idle time information generated by the
.IR ss (1)
system status daemon.
This information is updated every 60 seconds on average.
.PP
The server is started by running
.IR "coshell +" .
The command exits after a child server process is forked in the background.
The optional
.I info
arguments name files containing local network host information
which may be generated by the two shell scripts
.I genlocal
and
.I genshare
in the
.B bin
subdirectory of the installation root directory.
If no files are specified then the default
.I local
is used.
The local network consists of hosts that share the same file name space.
.PP
Attributes used by
.I coshell
can be categorized
into two types, global and host-specific.
The
.I global
attributes control
.I coshell
and are not associated with any particular host.
Attribute value pairs, not including readonly ones, may be specified in the
local network host information files,
in
.B COATTRIBUTES
(see ENVIRONMENT),
or may be set or added using
.IR "coshell -a" ,
and may be referenced in an expression in
.BR COATTRIBUTES .
Attribute names must match [a-zA-Z_][a-zA-Z_0-9]*.
In the following description of these attributes,
.I host
may be an actual host name or a comma separated list of attribute value pairs
specified in
.BR COATTRIBUTES .
.PP
The attributes used by
.I coshell
are:
.TP
.BI auto =n
.B auto=1
adds the host to the automatic selection pool.
Hosts named in the
.B file
global-attribute have
.B auto=1
by default.
.TP
.BI bias =n.m
The host scheduling bias.
Hosts with a higher bias are (linearly) less likely to be scheduled for job
execution.
The default bias is
.BR 1.00 .
.TP
.BI busy =time
(global attribute)
The grace period for jobs running on busy hosts.
A host is busy when its
.B idle
attribute is non-zero and its minimum logged in
user idle time is less than the value of
.BR busy .
For a job running on an idle host,
.B busy
is the maximum amount of time the job may run after the host becomes
busy.
If the job does not finish within
.BR busy ,
the SIGSTOP signal is sent to the job and the job stops.
When the host idle time exceeds
.BR busy ,
the SIGCONT signal is sent to the job and the job resumes.
The meaningful units of
.I time
are m(inute) and h(our).
The default is
.BR busy=2m .
.TP
.BI cpu =n
The number of cpus on the host.
The default is
.BR 1 .
.TP
.BI file =path
(global attribute)
Names a file containing default attributes for machines on the local network.
If no directory components are specified then the subdirectory
.B share/lib/cs
of the installation root directory is searched.
The default attribute file for the local network is
.BR share/lib/cs/local .
.TP
.BI idle =time
The minimum logged in user idle time before jobs will be scheduled on the host.
The meaningful units of
.I time
are m(inute) and h(our).
The default is
.BR 0 ,
meaning no idle time restrictions.
.B idle
is usually
.B 15m
for workstations and is not specified (i.e., always
available) for compute servers.
.TP
.BI label =string
.I string
labels either the current
.I coshell
connection (via
.I coopen
in
.IR coshell (3))
or the current job (via
.I coexec
in
.IR coshell (3)).
Labels are displayed but are otherwise ignored.
.TP
.BI load =n.m
A readonly attribute that evaluates to the host load average.
.TP
.BI maxload =n.m
(global attribute)
The maximum host load average. No job will be
scheduled on a host with load average >=
.BR maxload .
The default
.B maxload=0
means no load average limit.
.TP
.BI name =host
The host name in the local domain (i.e., no .'s in the name).
The
.B name=
may be omitted.
In a host selection context:
.I host
may be a
.IR sh (1)
file match pattern;
.B \-
matches any host;
.B local
matches the local host.
.TP
.BI open =fd
A readonly attribute that evaluates to
.B 0
if the host shell is closed,
.B <0
if the host shell is being opened, and
.B >0
if the host shell is open.
.TP
.BI percpu =n
(global attribute)
The maximum number of concurrent jobs on each cpu. The default is
.BR 3 .
.TP
.BI perserver =n
(global attribute)
The maximum number of concurrent jobs run by
.IR coshell .
.B perserver
has an upper limit that is silently enforced.
The limit is half the number of file descriptors allowed.
.B perserver=0
queues jobs until
.BR perserver>0 .
.TP
.BI peruser =n
(global attribute)
The maximum number of concurrent jobs per user connection to
.IR coshell .
The default is
.BR 12 .
.TP
.BI pool =n
(global attribute)
The number of cpus in the processor pool.
.TP
.BI rating =n.m
The host cpu rating, usually in mips relative to the other hosts
on the local network. This is usually the observed rating rather than
the one in the vendor's advertisements.
.TP
.BI type =string
The host type that differentiates different processor types, usually
related to the object and executable attributes.
The default type is
.BR * .
.TP
.BI up =n
A readonly attribute that evaluates to the number of seconds the host has
been up.
If
.I n
is less than 0 then it is the number of seconds the host has been down.
.PP
Other user-defined attributes may be specified.
They may be referenced in
.B COATTRIBUTES
expressions, but are otherwise ignored by
.IR coshell .
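.PP
As a minimal sketch of setting attributes on a running server, the following
uses the
.L "coshell -a"
form mentioned above.
The host name
.L dodo
and all values are illustrative assumptions rather than recommended settings,
and attaching global attributes to
.L local
is assumed from the way the sample host attribute file is laid out.
.EX
# give a workstation a courtesy idle time and a higher bias
coshell -a dodo,idle=15m,bias=2.00
# adjust global attributes through the local entry
coshell -a local,busy=5m,maxload=4.00
.EE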
.PP
.L "coshell \- "
opens an interactive connection to the running server.
The commands are:
.TP
.BI a " host[,attributes ...]"
Set or add attributes for the named hosts.
.TP
.BI c " host ..."
Close the shell connections to the named hosts.
The hosts are also removed from the automatic selection pool.
.TP
\fBd\fP [\fIlevel\fP]
Set the server
.B stderr
to the
.B stderr
of the calling process.
If
.I level
is specified then the server debug trace level is set to
.RI \- level .
Higher debug levels produce more output on
.BR stderr .
.TP
\fBf\fP [\fIfd\fP]
This is a debugging option and may not be present in all
implementations.
If
.I fd
is specified then close the internal server file descriptor
.IR fd ,
otherwise list the status of all open file descriptors in the server.
.TP
.B g
List global state.
.TP
.B h
List command help.
.TP
.B j
List the status of all jobs.
The status fields are:
.RS
.PD 0
.TP .6i
.B JOB
The id assigned to the job by the server.
This number may be used as an argument to the
.B k
command.
.TP .6i
.B USR
The id assigned to the requesting user by the server.
.TP .6i
.B RID
The id assigned to the job by the requesting user.
.TP .6i
.B PID
The job process id,
.B QUEUE
if the shell is in the process of opening,
.B START
if the PID has not been determined yet, and
.B WARPED
if the job completed before its PID was determined.
.TP .6i
.B TIME
The elapsed time since the job started.
.B *
follows the time if the job is about to terminate.
.TP .6i
.B HOST
The host where the job is running.
The most recent signal sent to the job follows the host name.
.TP .6i
.B LABEL
The label assigned to the job by the requesting user.
.PD
.RE
.TP
\fBk\fP [ \fBc\fP | \fBk\fP | \fBs\fP | \fBt\fP ] \fIjob\fP
Kill the job with the server JOB id
.IR job .
If no argument is specified then the
.B SIGTERM
signal is sent to the job.
.B c
sends
.BR SIGCONT ,
.B k
sends
.BR SIGKILL ,
.B s
sends
.BR SIGSTOP ,
and
.B t
sends
.BR SIGTERM .
.TP
.BI l " expr"
List all host names matching the attribute expression
.IR expr .
The names are sorted in scheduling rank order from best to worst.
If
.BI pool =n
is specified in
.I expr
then only the first
.I n
names (after sorting) are listed.
.TP
.BI o " host ..."
Open a shell connection to the named hosts.
.TP
.B q
Quit the interactive connection.
.TP
.B Q
Kill the server and quit the interactive connection.
.TP
\fBr\fP \fIhost\fP [ \fIcommand\fP ]
Run
.I command
on
.IR host .
.I host
may be an attribute expression.
If
.I command
is omitted then
.IR hostname (1)
is used.
.TP
\fBs\fP [ \fBa\fP | \fBe\fP | \fBl\fP | \fBo\fP | \fBp\fP | \fBs\fP | \fBt\fP ]
List the shell connection status.
There is at most one shell connection per host.
If no argument is specified then only open connections are listed.
.B a
lists the attributes for all shells,
.B e
lists all shells,
.B l
lists all shells in the processor pool,
.B o
lists all open shells,
.B p
lists the process id of all open shells,
.B s
lists the shell scheduling status (primarily for debugging),
and
.B t
lists all open shells sorted by recent job activity on each host.
.PP
The status fields for \fBse\fP and \fBsl\fP are:
.RS
.PD 0
.TP .6i
.B CON
The id assigned to the open shell by the server,
.B \@
if the shell is not open and is not in the processor pool,
.B \-
if the shell is not open, and
.B +
if an open is in progress.
.TP .6i
.B JOBS
The number of jobs currently running on the host.
.B *
follows the number if any of the jobs are queued pending the completion of an
open in progress.
.TP .6i
.B TOTAL
The total number of jobs run on the host.
.TP .6i
.B USER
The accumulated user time
.RI ( times (2) )
of all jobs on the host.
.TP .6i
.B SYS
The accumulated sys time
.RI ( times (2) )
of all jobs on the host.
.TP .6i
.B IDLE
The elapsed time since the most recent logged in user activity.
.B *
follows the time if the host does not meet the processor pool
idle time requirements.
.TP .6i
.B CPU
The number of cpus on the host.
.TP .6i
.B LOAD
The host load average.
.TP .6i
.B RATING
The host rating, usually in network relative mips.
.TP .6i
.B BIAS
The scheduling bias.
Hosts with lower bias are more likely to be scheduled.
.TP .6i
.B TYPE
The host type, usually related to object and executable attributes.
.TP .6i
.B HOST
The host name.
.PD
.RE
.PP
The status fields for \fBso\fP, \fBss\fP, and \fBst\fP are:
.RS
.PD 0
.TP .6i
.B CON
The id assigned to the open shell by the server,
.B \@
if the shell is not open and is not in the processor pool,
.B \-
if the shell is not open, and
.B +
if an open is in progress.
.TP .6i
.B OPEN
The accumulated number of times the server has connected to the host.
.TP .6i
.B USERS
The current number of active users.
.TP .6i
.B UP
The amount of time the host has been up.
.TP .6i
.B CONNECT
The amount of time the server has been connected to the host.
.TP .6i
.B UPDATE
The amount of time before the host status information is out-of-date.
.TP .6i
.B OVERRIDE
The amount of time the host connection is kept, followed
by the host identification code: 1 for the local host, 0 for other
hosts in the network.
.TP .6i
.B IDLE
The specified idle time.
.TP .6i
.B TEMP
A measure of recent job activity on the host.
.TP .6i
.B RANK
A measure of the desirability of the host.
This takes the idle time restriction, the load average, and the number
of cpus into account.
The two digits after the decimal point are random and are
used to break ties between different
.I coshell
servers. Hosts with lower
.B RANK
are more likely to be scheduled.
.TP .6i
.B HOST
The host name.
.PD
.RE
.TP
.B t
List the accumulated totals.
The fields are:
.RS
.PD 0
.TP .6i
.B SHELLS
The number of active shell connections followed by the total number
of successful shell connections.
.TP .6i
.B USERS
The number of active user connections followed by the total number
of successful user connections.
.TP .6i
.B JOBS
The number of active jobs followed by the total number
of jobs run.
.TP .6i
.B CMDS
The number of server-user transactions.
.TP .6i
.B UP
The elapsed time since the server started.
.TP .6i
.B REAL
The elapsed time during which the USER and SYS times were accumulated.
.TP .6i
.B USER
The accumulated user time for all jobs on all hosts.
.TP .6i
.B SYS
The accumulated sys time for all jobs on all hosts.
.TP .6i
.B CPU
The number of cpus available on all connected hosts followed by the
processor pool cpu limit plus the explicit host override count.
An
.I override
host is a connected host that does not meet the processor pool
idle time requirements.
.TP .6i
.B LOAD
The load average, averaged over all connected hosts.
.TP .6i
.B RATING
The host rating, averaged over all connected hosts.
.PD
.RE
.TP
.B u
List connected user status.
The status fields are:
.RS
.PD 0
.TP .6i
.B CON
The id assigned to the user connection by the server.
.TP .6i
.B PID
The user process id.
.TP .6i
.B JOBS
The number of jobs currently running on behalf of the user.
.TP .6i
.B TOTAL
The total number of jobs requested by the user.
.TP .6i
.B TTY
The user process
.B stderr
file name.
.TP .6i
.B LABEL
The label assigned to the connection by the requesting user.
.PD
.RE
.TP
.B v
List the server version stamp.
.PP
The interactive commands are useful for tuning global attribute values.
For example, one could set
.B NPROC
to 100, export it,
and control the number of jobs executed using the
.I coshell
interactive command:
.EX
coshell> a local,peruser=10,perserver=40
.EE
.PP
The interactive commands may be used as options for non-interactive
.I coshell
queries.
For example,
.L "coshell -sl"
produces a long shell status listing and
.L "coshell -c dodo"
closes the shell connection to the host
.LR dodo .
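.PP
A short monitoring session might look like the following sketch.
The job id
.L 3
is hypothetical, and each option form is assumed to accept the same
arguments as the corresponding interactive command.
.EX
# list the status of all jobs
coshell -j
# send SIGTERM to the job with server JOB id 3
coshell -k 3
# list the accumulated totals
coshell -t
.EE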
.SH EXAMPLES
The following environment variables must be set if
.I coshell
is installed in a non-standard directory (not
.BR /bin ,
.BR /usr/bin ,
or
.BR /usr/local/bin ):
.EX
root=<coshell-installation-root-directory>
export PATH=$root/bin:$PATH
.EE
If
.I coshell
is dynamically linked, the
.B LD_LIBRARY_PATH
environment variable needs to
be set.
.EX
export LD_LIBRARY_PATH=$root/lib:$LD_LIBRARY_PATH
.EE
.PP
The following two commands generate the local
network host information, which is shared among all
.I coshell
users and needs to be regenerated only when the information changes.
If you run into permission problems, contact your system administrator.
.EX
genshare > $root/lib/cs/share
genlocal > $root/share/lib/cs/local
.EE
The
.I genshare
command is run first to generate information on servers for the network.
By default, this information is stored in $root/lib/cs/share.
Based on this information,
.I genlocal
is run to generate the local host attribute file. By default,
this information is stored in $root/share/lib/cs/local. If the
.I share
file generated by the
.I genshare
command is not stored in the default path, pass its path to the
.I genlocal
command using the
.B \-f
option.
.PP
You may also modify the generated files to meet your needs.
.PP
A sample local host attribute file follows:
.EX
#
# local host attributes
#
local pool=8 bias=4 busy=1m
server type=sun4 rating=20
cruncher type=mips rating=30 cpu=20
station type=sun3 rating=6 idle=15m
token type=3b rating=0.1 idle=15m
.EE
The
.L local
entry sets the processor pool size to 8, the local host bias
to 4, and the busy host grace period to 1 minute.
Compute servers that are available to all users usually have no
.L idle
attribute whereas personal workstations are given at least
.L idle=15m
out of courtesy to the workstation owner.
.PP
The following starts the
.I coshell
server.
The processor pool size is taken from the local host attribute file.
.EX
coshell +
.EE
The following instructs programs using
.IR coshell (3)
to use
.I coshell
rather than
.I ksh
or
.I sh
for command execution and sets the command execution concurrency level to 8.
.EX
export COSHELL=coshell
export NPROC=8
.EE
.PP
The shell function
.I cosh
provides a convenient interface for common coshell actions:
.EX
export FPATH=$root/fun:$FPATH
# start coshell, export COSHELL,NPROC, and set window title
cosh
\fIcoshell (AT&T Bell Laboratories) 10/11/93\fP [\fIfirst time only\fP]
# run hostname on best host
on - hostname
\fIdodo\fP
# interact with server ...
cosh -
\fIcoshell>\fP
.EE
.SH CAVEATS
A
.I coshell
connect stream file is created in the
.L /tmp/cs
directory.
Some systems:
.RS
.TP
.B (1)
do not update the times on the connect stream file when it is accessed
.TP
.B (2)
automatically remove stale files from
.L /tmp
.TP
.B (3)
fail to generate a
.IR poll (2)
or
.IR select (2)
event when the connect stream file is removed
.TP
.B (4)
do not handle mounted streams or sockets.
.RE
.PP
In any of these cases, the environment variable
.BR CS_MOUNT_LOCAL
must be set to a directory on another file system where
all users have read and write permission. For example:
.EX
export CS_MOUNT_LOCAL=<coshell-installation-root-directory>/tmp
.EE
.PP
On some systems the server may not detect that its connect stream
file has been unlinked,
resulting in erroneous `server not running' errors.
To handle this situation the server checks and recreates the connect
stream file on receipt of a
.B SIGINT
signal.
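.PP
A minimal sketch of forcing that check by hand, where
.L <coshell-server-pid>
is a placeholder for the process id of the running server
(how that id is found is left to the local system):
.EX
kill -INT <coshell-server-pid>
.EE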
.PP
NFS cache inconsistencies may arise for files generated via NFS on remote hosts
but serviced via the native file system on the local host.
Running
.I coshell
from a diskless host avoids the problem.
.PP
Host load average and logged in user idle times are used
to schedule hosts and jobs.
Some terminal lock programs, e.g.,
.IR xlock (1),
inflate the load average, usually doing complex graphics operations
on displays that have long since been blanked out by an independent
screen saver.
A simple lock program that blocks on a read request may open up idle cycles
for better use.
.SH ENVIRONMENT
.PD 0
.TP 1.2i
.B COATTRIBUTES
Host attribute expression,
.B (type@local)
by default.
Non-numeric valued attributes may appear as the first operand of
the comparison operators
.BR < ,
.BR <= ,
.BR == ,
.BR != ,
.BR >=
and
.BR > ,
where the second operand must be a \fB"..."\fP or \fB'...'\fP string
that is compared with the attribute value.
For the
.B ==
and
.B !=
operators the second operand is taken to be a
.IR ksh (1)
file match pattern.
For example, given the host definitions:
.EX
coot type=sun4 mem=8m rating=11.0 cad
dodo type=sun3 mem=4m rating=2.0
loon type=mips mem=16m rating=20.0
.EE
.L "(type=='sun*'&&mem>6m)"
selects
.LR coot ,
.L "(rating>=11.0)"
selects
.L coot
and
.LR loon ,
and
.L "(cad)"
selects
.LR coot .
.IR attribute @ host
represents the
.I attribute
value for
.IR host .
For example,
.L type@local
matches the type of the host running the
.I coshell
server.
.TP 1.2i
.B COEXPORT
A colon separated list of environment variables to export to each job.
This is to support the rare cases where some environment variables
change after the
.I coshell
server has been started.
For example, some commands use environment variables rather
than arguments or options to pass input data.
.TP 1.2i
.B COSHELL
Set to
.L coshell
for the network shell service.
.TP 1.2i
.B COTEMP
Set to a different value for each shell command.
It is used for temporary file names (see Engine Variables in
.IR nmake (1)).
This variable may be referenced in
.BR .profile .
.TP 1.2i
.B HOMEHOST
Set within each action to the name of the host executing
.IR coshell .
.TP 1.2i
.B HOSTNAME
Set within each action to the name of the host executing the action.
This variable may be referenced in
.BR .profile .
.TP 1.2i
.B HOSTTYPE
Set within each action to the type
(from the local coshell host attribute file)
of the host executing the action.
This variable may be referenced in
.BR .profile .
.TP 1.2i
.B NPROC
Default command concurrency level.
.PD
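.PP
As an illustrative sketch, the following restricts host selection with the
expression from the
.B COATTRIBUTES
description above and exports two variables to each job.
The variable names listed in
.B COEXPORT
are hypothetical.
.EX
export COATTRIBUTES="(type=='sun*'&&mem>6m)"
export COEXPORT=DISPLAY:LICENSE_FILE
.EE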
.SH FILES
.TP 2i
.B share/lib/cs/local
local network host attributes
.SH AUTHOR
Glenn Fowler
.br
gsf@research.att.com
.br
AT&T Bell Laboratories
.SH "SEE ALSO"
3d(1), ksh(1), nmake(1), rsh(1), ss(1), coshell(3), cs(3)