svc.c revision 2695d4f4d1e2a6022c8a279d40c3cb750964974d
/*
* CDDL HEADER START
*
* The contents of this file are subject to the terms of the
* Common Development and Distribution License (the "License").
* You may not use this file except in compliance with the License.
*
* You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
* See the License for the specific language governing permissions
* and limitations under the License.
*
* When distributing Covered Code, include this CDDL HEADER in each
* file and include the License file at usr/src/OPENSOLARIS.LICENSE.
* If applicable, add the following below this CDDL HEADER, with the
* fields enclosed by brackets "[]" replaced with your own identifying
* information: Portions Copyright [yyyy] [name of copyright owner]
*
* CDDL HEADER END
*/
/*
* Copyright 2015 Nexenta Systems, Inc. All rights reserved.
*/
/*
* Copyright 2010 Sun Microsystems, Inc. All rights reserved.
* Use is subject to license terms.
*/
/*
* Copyright 1993 OpenVision Technologies, Inc., All Rights Reserved.
*/
/* Copyright (c) 1983, 1984, 1985, 1986, 1987, 1988, 1989 AT&T */
/* All Rights Reserved */
/*
* Portions of this source code were derived from Berkeley 4.3 BSD
* under license from the Regents of the University of California.
*/
/*
* Server-side remote procedure call interface.
*
* Master transport handle (SVCMASTERXPRT).
* The master transport handle structure is shared among service
* threads processing events on the transport. Some fields in the
* master structure are protected by locks
* - xp_req_lock protects the request queue:
* xp_req_head, xp_req_tail, xp_reqs, xp_size, xp_full, xp_enable
* - xp_thread_lock protects the thread (clone) counts
* xp_threads, xp_detached_threads, xp_wq
* Each master transport is registered to exactly one thread pool.
*
* Clone transport handle (SVCXPRT)
* The clone transport handle structure is a per-service-thread handle
* for request processing. A service thread or, in other words, a clone
* structure, can be linked to an arbitrary master structure to process
* requests on this transport. The master handle keeps track of reference
* counts of threads (clones) linked to it. A service thread can switch
* to another transport by unlinking its clone handle from the current
* transport and linking to a new one. Switching is relatively inexpensive
* but it involves locking (master's xprt->xp_thread_lock).
*
* Pools.
* A pool represents a kernel RPC service (NFS, Lock Manager, etc.).
* Transports related to the service are registered to the service pool.
* Service threads can switch between different transports in the pool.
* Thus, each service has its own pool of service threads. The maximum
 * number of threads in a pool is pool->p_maxthreads. This limit allows us
 * to restrict resource usage by the service. Some fields are protected
* by locks:
* - p_req_lock protects several counts and flags:
* p_reqs, p_size, p_walkers, p_asleep, p_drowsy, p_req_cv
* - p_thread_lock governs other thread counts:
* p_threads, p_detached_threads, p_reserved_threads, p_closing
*
* In addition, each pool contains a doubly-linked list of transports,
* an `xprt-ready' queue and a creator thread (see below). Threads in
* the pool share some other parameters such as stack size and
* polling timeout.
*
* Pools are initialized through the svc_pool_create() function called from
* the nfssys() system call. However, thread creation must be done by
* the userland agent. This is done by using SVCPOOL_WAIT and
* SVCPOOL_RUN arguments to nfssys(), which call svc_wait() and
* svc_do_run(), respectively. Once the pool has been initialized,
* the userland process must set up a 'creator' thread. This thread
* should park itself in the kernel by calling svc_wait(). If
* svc_wait() returns successfully, it should fork off a new worker
* thread, which then calls svc_do_run() in order to get work. When
* that thread is complete, svc_do_run() will return, and the user
* program should call thr_exit().
*
* When we try to register a new pool and there is an old pool with
* the same id in the doubly linked pool list (this happens when we kill
* and restart nfsd or lockd), then we unlink the old pool from the list
* and mark its state as `closing'. After that the transports can still
* process requests but new transports won't be registered. When all the
* transports and service threads associated with the pool are gone the
* creator thread (see below) will clean up the pool structure and exit.
*
* svc_queuereq() and svc_run().
* The kernel RPC server is interrupt driven. The svc_queuereq() interrupt
* routine is called to deliver an RPC request. The service threads
* loop in svc_run(). The interrupt function queues a request on the
* transport's queue and it makes sure that the request is serviced.
 * It may either wake up one of the sleeping threads, or ask for a new thread
* to be created, or, if the previous request is just being picked up, do
* nothing. In the last case the service thread that is picking up the
* previous request will wake up or create the next thread. After a service
* thread processes a request and sends a reply it returns to svc_run()
* and svc_run() calls svc_poll() to find new input.
*
* svc_poll().
* In order to avoid unnecessary locking, which causes performance
* problems, we always look for a pending request on the current transport.
* If there is none we take a hint from the pool's `xprt-ready' queue.
 * If the queue has overflowed we switch to the `drain' mode, checking
* each transport in the pool's transport list. Once we find a
* master transport handle with a pending request we latch the request
* lock on this transport and return to svc_run(). If the request
 * belongs to a transport different from the one the service thread is
 * linked to, it unlinks from the old transport and links to the new one.
*
 * A service thread goes to sleep when there are no pending
 * requests on the transports registered with the pool.
 * All the pool's threads sleep on the same condition variable.
 * If a thread has been sleeping for too long a period of time
 * (by default 5 seconds) it wakes up and exits. Also, when a transport
 * is closing, sleeping threads wake up to unlink from this transport.
*
* The `xprt-ready' queue.
 * If a service thread finds no request on the transport it is currently
 * linked to, it looks for another transport with a pending request. To make
 * this search more efficient each pool has an `xprt-ready' queue.
 * The queue is a FIFO. When the interrupt routine queues a request it also
 * inserts a pointer to the transport into the `xprt-ready' queue. A
 * thread looking for a transport with a pending request can pop a
 * transport off the queue and check it for a request. The request may
 * already be gone, since it could have been taken by a thread linked to
 * that transport. In such a case we try the next hint. The `xprt-ready'
 * queue has a fixed size (by default 256 nodes). If it overflows svc_poll()
 * has to switch to the less efficient but safe `drain' mode and walk
 * through the pool's transport list.
*
 * Both the svc_poll() loop and the `xprt-ready' queue are optimized
 * for the peak load case, that is, for the situation when the queue is not
 * empty, there are always a few pending requests, and a service
 * thread that has just processed a request does not go to sleep but
 * immediately picks up the next request.
*
* Thread creator.
* Each pool has a thread creator associated with it. The creator thread
* sleeps on a condition variable and waits for a signal to create a
* service thread. The actual thread creation is done in userland by
* the method described in "Pools" above.
*
* Signaling threads should turn on the `creator signaled' flag, and
* can avoid sending signals when the flag is on. The flag is cleared
* when the thread is created.
*
 * When the pool is in the closing state (i.e., it has already been unregistered
* from the pool list) the last thread on the last transport in the pool
* should turn the p_creator_exit flag on. The creator thread will
* clean up the pool structure and exit.
*
* Thread reservation; Detaching service threads.
* A service thread can detach itself to block for an extended amount
* of time. However, to keep the service active we need to guarantee
* at least pool->p_redline non-detached threads that can process incoming
 * requests. Thus, the maximum number of detached and reserved threads is
 * pool->p_maxthreads - pool->p_redline. A service thread should first acquire
* a reservation, and if the reservation was granted it can detach itself.
* If a reservation was granted but the thread does not detach itself
* it should cancel the reservation before it returns to svc_run().
*/
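/*
 * The userland side of the "Pools" protocol above, as an illustrative
 * sketch only (guarded by SVC_EXAMPLES so it is normally compiled out).
 * svcpool_wait() and svcpool_run() are hypothetical wrappers around the
 * SVCPOOL_WAIT and SVCPOOL_RUN calls into nfssys(); real daemons such as
 * nfsd follow the same shape with thr_create()/thr_exit().
 */
#ifdef SVC_EXAMPLES
#include <sys/types.h>
#include <thread.h>

extern int svcpool_wait(int poolid);	/* hypothetical nfssys(SVCPOOL_WAIT) */
extern int svcpool_run(int poolid);	/* hypothetical nfssys(SVCPOOL_RUN) */

/* Worker: parks in the kernel in svc_do_run() until the pool goes away. */
static void *
svc_example_worker(void *arg)
{
	(void) svcpool_run((int)(uintptr_t)arg);
	thr_exit(NULL);
	return (NULL);
}

/* Creator: parks in svc_wait(); each successful return spawns a worker. */
static void *
svc_example_creator(void *arg)
{
	while (svcpool_wait((int)(uintptr_t)arg) == 0)
		(void) thr_create(NULL, 0, svc_example_worker, arg,
		    THR_DETACHED, NULL);
	thr_exit(NULL);		/* pool is closing */
	return (NULL);
}
#endif	/* SVC_EXAMPLES */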
/*
* Defines for svc_poll()
*/
/*
* Default stack size for service threads.
*/
#define DEFAULT_SVC_RUN_STKSIZE (0) /* default kernel stack */
/*
* Default polling timeout for service threads.
* Multiplied by hz when used.
*/
/*
* Size of the `xprt-ready' queue.
*/
/*
* Default limit for the number of service threads.
*/
#define DEFAULT_SVC_MAXTHREADS (INT16_MAX)
/*
* Maximum number of requests from the same transport (in `drain' mode).
*/
#define DEFAULT_SVC_MAX_SAME_XPRT (8)
/*
* Default `Redline' of non-detached threads.
 * The total number of detached and reserved threads in an RPC server
* thread pool is limited to pool->p_maxthreads - svc_redline.
*/
#define DEFAULT_SVC_REDLINE (1)
/*
* A node for the `xprt-ready' queue.
* See below.
*/
struct __svcxprt_qnode {
};
/*
* Global SVC variables (private).
*/
struct svc_globals {
};
/*
* Debug variable to check for rdma based
 * transport startup and cleanup. Controlled through /etc/system.
*/
int rdma_check = 0;
/*
* This allows disabling flow control in svc_queuereq().
*/
volatile int svc_flowcontrol_disable = 0;
/*
* Authentication parameters list.
*/
static caddr_t rqcred_head;
static kmutex_t rqcred_lock;
/*
* Pointers to transport specific `rele' routines in rpcmod (set from rpcmod).
*/
/* ARGSUSED */
void
{
}
/*
* This macro picks which `rele' routine to use, based on the transport type.
*/
/*
* If true, then keep quiet about version mismatch.
* This macro is for broadcast RPC only. We have no broadcast RPC in
* kernel now but one may define a flag in the transport structure
* and redefine this macro.
*/
/*
* ZSD key used to retrieve zone-specific svc globals
*/
static zone_key_t svc_zone_key;
static void svc_callout_free(SVCMASTERXPRT *);
static void svc_xprt_qdestroy(SVCPOOL *);
static void svc_thread_creator(SVCPOOL *);
static void svc_creator_signal(SVCPOOL *);
static void svc_creator_signalexit(SVCPOOL *);
/* ARGSUSED */
static void *
{
struct svc_globals *svc;
return (svc);
}
/* ARGSUSED */
static void
{
}
}
/* ARGSUSED */
static void
{
}
/*
* Global SVC init routine.
* Initialize global generic and transport type specific structures
* used by the kernel RPC server side. This routine is called only
* once when the module is being loaded.
*/
void
svc_init()
{
}
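/*
 * A sketch of the zone hookup this routine performs (the exact body is
 * elided above; the callback names below are assumptions standing in for
 * the unnamed static zone callbacks declared earlier in this file):
 *
 *	zone_key_create(&svc_zone_key, svc_zone_init_cb,
 *	    svc_zone_shutdown_cb, svc_zone_destroy_cb);
 *
 * zone_key_create() registers per-zone create/shutdown/destroy callbacks
 * and stores the key in svc_zone_key for later zone_getspecific() lookups
 * of the per-zone struct svc_globals.
 */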
/*
* Destroy the SVCPOOL structure.
*/
static void
{
/*
* Call the user supplied shutdown function. This is done
* here so the user of the pool will be able to cleanup
* service related resources.
*/
(pool->p_shutdown)();
/* Destroy `xprt-ready' queue */
/* Destroy transport list */
/* Destroy locks and condition variables */
/* Destroy creator's locks and condition variables */
/* Free pool structure */
}
/*
 * If all the transports and service threads are already gone,
* signal the creator thread to clean up and exit.
*/
static bool_t
{
/*
* Release the locks before sending a signal.
*/
/*
* Notify the creator thread to clean up and exit
*
* NOTICE: No references to the pool beyond this point!
* The pool is being destroyed.
*/
return (TRUE);
}
}
return (FALSE);
}
/*
* Find a pool with a given id.
*/
static SVCPOOL *
{
/*
* Search the list for a pool with a matching id
* and register the transport handle with that pool.
*/
return (pool);
return (NULL);
}
/*
* PSARC 2003/523 Contract Private Interface
* svc_do_run
* Changes must be reviewed by Solaris File Sharing
* Changes must be communicated to contract-2003-523@sun.com
*/
int
svc_do_run(int id)
{
int err = 0;
struct svc_globals *svc;
return (ENOENT);
/*
* Increment counter of pool threads now
* that a thread has been created.
*/
/* Give work to the new thread. */
return (err);
}
/*
* Unregister a pool from the pool list.
* Set the closing state. If all the transports and service threads
 * are already gone, signal the creator thread to clean up and exit.
*/
static void
{
/* Remove from the list */
if (next)
if (prev)
/*
* Offline the pool. Mark the pool as closing.
* If there are no transports in this pool notify
* the creator thread to clean it up and exit.
*/
if (svc_pool_tryexit(pool))
return;
}
/*
* Register a pool with a given id in the global doubly linked pool list.
* - if there is a pool with the same id in the list then unregister it
* - insert the new pool into the list.
*/
static void
{
/*
* If there is a pool with the same id then remove it from
* the list and mark the pool as closing.
*/
/* Insert into the doubly linked list */
}
/*
* Initialize a newly created pool structure
*/
static int
{
if (maxthreads == 0)
if (redline == 0)
if (qsize == 0)
if (timeout == 0)
if (stksize == 0)
if (max_same_xprt == 0)
if (maxthreads < redline)
return (EINVAL);
/* Allocate and initialize the `xprt-ready' queue */
/* Initialize doubly-linked xprt list */
/*
* Setting lwp_childstksz on the current lwp so that
* descendants of this lwp get the modified stacksize, if
* it is defined. It is important that either this lwp or
 * one of its descendants does the actual servicepool thread
* creation to maintain the stacksize inheritance.
*/
/* Initialize thread limits, locks and condition variables */
/* Initialize userland creator */
/* Initialize the creator and start the creator thread */
pool, 0, minclsyspri);
return (0);
}
/*
* PSARC 2003/523 Contract Private Interface
* svc_pool_create
* Changes must be reviewed by Solaris File Sharing
* Changes must be communicated to contract-2003-523@sun.com
*
*
 * This is the public interface for creating a server RPC thread pool
* for a given service provider. Transports registered with the pool's id
 * will be served by the pool's threads. This function is called from the
* nfssys() system call.
*/
int
{
int error;
struct svc_globals *svc;
/*
* Caller should check credentials in a way appropriate
* in the context of the call.
*/
/* Allocate a new pool */
/*
* Initialize the pool structure and create a creator thread.
*/
if (error) {
return (error);
}
/* Register the pool with the global pool list */
return (0);
}
int
{
struct svc_globals *svc;
switch (cmd) {
case SVCPSET_SHUTDOWN_PROC:
/*
* Search the list for a pool with a matching id
		 * and register the shutdown procedure with that pool.
*/
return (ENOENT);
}
/*
* Grab the transport list lock before releasing the
* pool list lock
*/
return (0);
case SVCPSET_UNREGISTER_PROC:
/*
* Search the list for a pool with a matching id
* and register the unregister callback handle with that pool.
*/
return (ENOENT);
}
/*
* Grab the transport list lock before releasing the
* pool list lock
*/
return (0);
default:
return (EINVAL);
}
}
/*
* Pool's transport list manipulation routines.
* - svc_xprt_register()
* - svc_xprt_unregister()
*
* svc_xprt_register() is called from svc_tli_kcreate() to
* insert a new master transport handle into the doubly linked
* list of server transport handles (one list per pool).
*
* The list is used by svc_poll(), when it operates in `drain'
 * mode, to search for the next transport with a pending request.
*/
int
{
struct svc_globals *svc;
/*
* Search the list for a pool with a matching id
* and register the transport handle with that pool.
*/
return (ENOENT);
}
/* Grab the transport list lock before releasing the pool list lock */
/* Don't register new transports when the pool is in closing state */
return (EBUSY);
}
/*
* Initialize xp_pool to point to the pool.
* We don't want to go through the pool list every time.
*/
/*
* Insert a transport handle into the list.
* The list head points to the most recently inserted transport.
*/
else {
}
/* Increment the transports count */
return (0);
}
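/*
 * A minimal sketch of the insertion above (the body is elided; the
 * xp_next/xp_prev links and the p_lhead/p_lcount pool fields are assumed
 * names). The list is circular and doubly linked, and the head points
 * to the most recently inserted transport:
 *
 *	if (pool->p_lhead == NULL) {
 *		pool->p_lhead = xprt->xp_next = xprt->xp_prev = xprt;
 *	} else {
 *		SVCMASTERXPRT *head = pool->p_lhead;
 *
 *		xprt->xp_next = head;
 *		xprt->xp_prev = head->xp_prev;
 *		head->xp_prev->xp_next = xprt;
 *		head->xp_prev = xprt;
 *		pool->p_lhead = xprt;
 *	}
 *	pool->p_lcount++;
 */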
/*
* Called from svc_xprt_cleanup() to remove a master transport handle
* from the pool's list of server transports (when a transport is
* being destroyed).
*/
void
{
/*
* Unlink xprt from the list.
* If the list head points to this xprt then move it
* to the next xprt or reset to NULL if this is the last
* xprt in the list.
*/
else {
}
/* Decrement list count */
}
static void
{
}
/*
* Initialize an `xprt-ready' queue for a given pool.
*/
static void
{
int i;
KM_SLEEP);
}
/*
* Called from the svc_queuereq() interrupt routine to queue
 * a hint for svc_poll() about which transport has a pending request.
* - insert a pointer to xprt into the xprt-ready queue (FIFO)
* - if the xprt-ready queue is full turn the overflow flag on.
*
* NOTICE: pool->p_qtop is protected by the pool's request lock
* and the caller (svc_queuereq()) must hold the lock.
*/
static void
{
/* If the overflow flag is on there is nothing we can do */
if (pool->p_qoverflow)
return;
/* If the queue is full turn the overflow flag on and exit */
return;
}
}
/* Insert a hint and move pool->p_qtop */
}
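/*
 * A condensed sketch of the put logic above (the q_next/q_xprt node
 * fields are assumed names; the qnode struct body is elided earlier in
 * this file). The queue is a preallocated ring of qnodes with p_qtop
 * chasing p_qend:
 *
 *	if (pool->p_qoverflow)
 *		return;				   (hints unreliable already)
 *	if (pool->p_qtop->q_next == pool->p_qend) {
 *		pool->p_qoverflow = TRUE;	   (ring is full)
 *		return;
 *	}
 *	pool->p_qtop->q_xprt = xprt;		   (insert the hint)
 *	pool->p_qtop = pool->p_qtop->q_next;	   (advance the top)
 */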
/*
 * Called from svc_poll() to get a hint about which transport has a
* pending request. Returns a pointer to a transport or NULL if the
* `xprt-ready' queue is empty.
*
* Since we do not acquire the pool's request lock while checking if
* the queue is empty we may miss a request that is just being delivered.
* However this is ok since svc_poll() will retry again until the
* count indicates that there are pending requests for this pool.
*/
static SVCMASTERXPRT *
{
do {
/*
* If the queue is empty return NULL.
		 * Since we do not acquire the pool's request lock, which
		 * protects pool->p_qtop, this is not an exact check. However,
* this is safe - if we miss a request here svc_poll()
* will retry again.
*/
return (NULL);
}
/* Get a hint and move pool->p_qend */
/* Skip fields deleted by svc_xprt_qdelete() */
return (xprt);
}
/*
* Delete all the references to a transport handle that
* is being destroyed from the xprt-ready queue.
* Deleted pointers are replaced with NULLs.
*/
static void
{
__SVCXPRT_QNODE *q;
}
}
/*
* Destructor for a master server transport handle.
* - if there are no more non-detached threads linked to this transport
* then, if requested, call xp_closeproc (we don't wait for detached
* threads linked to this transport to complete).
* - if there are no more threads linked to this
* transport then
* a) remove references to this transport from the xprt-ready queue
* b) remove a reference to this transport from the pool's transport list
* c) call a transport specific `destroy' function
* d) cancel remaining thread reservations.
*
* NOTICE: Caller must hold the transport's thread lock.
*/
static void
{
/*
* If called from the last non-detached thread
* it should call the closeproc on this transport.
*/
}
else {
/* Remove references to xprt from the `xprt-ready' queue */
/* Unregister xprt from the pool's transport list */
}
}
/*
* This function is called from svc_getreq() to search the callout
* table for an entry with a matching RPC program number `prog'
* and a version range that covers `vers'.
 * - if it finds a matching entry it returns a pointer to the dispatch routine
 * - otherwise it returns NULL and, if `minp' or `maxp' are not NULL,
 *   fills them with, respectively, the lowest and the highest version
 *   supported for the program `prog'.
*/
static SVC_DISPATCH *
{
int i;
*vers_max = 0;
return (sc->sc_dispatch);
}
}
return (NULL);
}
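/*
 * An illustrative re-statement of the scan above (guarded by SVC_EXAMPLES
 * so it is normally compiled out). The SVC_CALLOUT field names
 * sc_prog/sc_versmin/sc_versmax are assumptions consistent with the
 * sc_dispatch use above; the table and its size are passed in explicitly
 * to keep the sketch self-contained.
 */
#ifdef SVC_EXAMPLES
static SVC_DISPATCH *
svc_callout_find_sketch(SVC_CALLOUT *table, int count, rpcprog_t prog,
    rpcvers_t vers, rpcvers_t *vers_min, rpcvers_t *vers_max)
{
	int i;

	*vers_min = 0;
	*vers_max = 0;

	for (i = 0; i < count; i++) {
		SVC_CALLOUT *sc = &table[i];

		if (sc->sc_prog != prog)
			continue;
		if (vers >= sc->sc_versmin && vers <= sc->sc_versmax)
			return (sc->sc_dispatch);	/* version covered */
		/* Remember the supported range for a mismatch reply. */
		*vers_min = sc->sc_versmin;
		*vers_max = sc->sc_versmax;
	}
	return (NULL);
}
#endif	/* SVC_EXAMPLES */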
/*
* Optionally free callout table allocated for this transport by
* the service provider.
*/
static void
{
}
}
/*
* Send a reply to an RPC request
*
* PSARC 2003/523 Contract Private Interface
* svc_sendreply
* Changes must be reviewed by Solaris File Sharing
* Changes must be communicated to contract-2003-523@sun.com
*/
const caddr_t xdr_location)
{
}
/*
* No procedure error reply
*
* PSARC 2003/523 Contract Private Interface
* svcerr_noproc
* Changes must be reviewed by Solaris File Sharing
* Changes must be communicated to contract-2003-523@sun.com
*/
void
{
}
/*
* Can't decode arguments error reply
*
* PSARC 2003/523 Contract Private Interface
* svcerr_decode
* Changes must be reviewed by Solaris File Sharing
* Changes must be communicated to contract-2003-523@sun.com
*/
void
{
}
/*
* Some system error
*/
void
{
}
/*
* Authentication error reply
*/
void
{
}
/*
* Authentication too weak error reply
*/
void
{
}
/*
* Authentication error; bad credentials
*/
void
{
}
/*
* Program unavailable error reply
*
* PSARC 2003/523 Contract Private Interface
* svcerr_noprog
* Changes must be reviewed by Solaris File Sharing
* Changes must be communicated to contract-2003-523@sun.com
*/
void
{
}
/*
* Program version mismatch error reply
*
* PSARC 2003/523 Contract Private Interface
* svcerr_progvers
* Changes must be reviewed by Solaris File Sharing
* Changes must be communicated to contract-2003-523@sun.com
*/
void
{
}
/*
* Get server side input from some transport.
*
* Statement of authentication parameters management:
* This function owns and manages all authentication parameters, specifically
* the "raw" parameters (msg.rm_call.cb_cred and msg.rm_call.cb_verf) and
* the "cooked" credentials (rqst->rq_clntcred).
* However, this function does not know the structure of the cooked
 * credentials, so it makes the following assumptions:
* a) the structure is contiguous (no pointers), and
* b) the cred structure size does not exceed RQCRED_SIZE bytes.
* In all events, all three parameters are freed upon exit from this routine.
* The storage is trivially managed on the call stack in user land, but
* is malloced in kernel land.
*
* Note: the xprt's xp_svc_lock is not held while the service's dispatch
* routine is running. If we decide to implement svc_unregister(), we'll
* need to decide whether it's okay for a thread to unregister a service
* while a request is being processed. If we decide that this is a
* problem, we can probably use some sort of reference counting scheme to
* keep the callout entry from going away until the request has completed.
*/
static void
{
struct svc_req r;
char *cred_area; /* too big to allocate on call stack */
"svc_getreq_start:");
/*
* Firstly, allocate the authentication parameters' storage
*/
if (rqcred_head) {
/* LINTED pointer alignment */
} else {
KM_SLEEP);
}
/*
	 * The underlying transport recv routine may modify the mblk data
	 * and make it difficult to extract the label afterwards. So
	 * get the label from the raw mblk data now.
*/
if (is_system_labeled()) {
sizeof (bslabel_t));
} else {
}
/*
* Now receive a message from the transport.
*/
/*
* Find the registered program and call its
* dispatch routine.
*/
r.rq_xprt = clone_xprt;
/*
* First authenticate the message.
*/
"svc_getreq_auth_start:");
"svc_getreq_auth_end:(%S)", "failed");
/*
* Free the arguments.
*/
} else if (no_dispatch) {
/*
* XXX - when bug id 4053736 is done, remove
* the SVC_FREEARGS() call.
*/
} else {
"svc_getreq_auth_end:(%S)", "good");
if (dispatchroutine) {
(*dispatchroutine) (&r, clone_xprt);
} else {
/*
* If we got here, the program or version
* is not served ...
*/
if (vers_max == 0 ||
else
vers_max);
/*
* Free the arguments. For successful calls
* this is done by the dispatch routine.
*/
/* Fall through to ... */
}
/*
* Call cleanup procedure for RPCSEC_GSS.
* This is a hack since there is currently no
* op, such as SVC_CLEANAUTH. rpc_gss_cleanup
* should only be called for a non null proc.
* Null procs in RPC GSS are overloaded to
* provide context setup and control. The main
* purpose of rpc_gss_cleanup is to decrement the
* reference count associated with the cached
* GSS security context. We should never get here
* for an RPCSEC_GSS null proc since *no_dispatch
* would have been set to true from sec_svc_msg above.
*/
}
}
/*
* Free authentication parameters' storage
*/
/* LINTED pointer alignment */
}
/*
 * Allocate a new clone transport handle.
*/
SVCXPRT *
svc_clone_init(void)
{
return (clone_xprt);
}
/*
* Free memory allocated by svc_clone_init.
*/
void
{
	/* Free credentials from crget() */
if (clone_xprt->xp_cred)
}
/*
* Link a per-thread clone transport handle to a master
* - increment a thread reference count on the master
* - copy some of the master's fields to the clone
* - call a transport specific clone routine.
*/
void
{
/*
* Bump up master's thread count.
* Linking a per-thread clone transport handle to a master
* associates a service thread with the master.
*/
xprt->xp_threads++;
/* Clear everything */
	/* Set pointer to the master transport structure */
/* Structure copy of all the common fields */
/* Restore per-thread fields (xp_cred) */
if (clone_xprt2)
}
/*
* Unlink a non-detached clone transport handle from a master
* - decrement a thread reference count on the master
* - if the transport is closing (xp_wq is NULL) call svc_xprt_cleanup();
 *   if this is the last non-detached or the last thread on this transport
* - call transport specific function to destroy the clone handle
* - clear xp_master to avoid recursion.
*/
void
{
/* This cannot be a detached thread */
/* Decrement a reference count on the transport */
xprt->xp_threads--;
/* svc_xprt_cleanup() unlocks xp_thread_lock or destroys xprt */
else
/* Call a transport specific clone `destroy' function */
/* Clear xp_master */
}
/*
* Unlink a detached clone transport handle from a master
* - decrement the thread count on the master
* - if the transport is closing (xp_wq is NULL) call svc_xprt_cleanup();
* if this is the last thread on this transport then it will destroy
* the transport.
* - call a transport specific function to destroy the clone handle
* - clear xp_master to avoid recursion.
*/
static void
{
/* This must be a detached thread */
/* Grab xprt->xp_thread_lock and decrement link counts */
/* svc_xprt_cleanup() unlocks xp_thread_lock or destroys xprt */
else
/* Call transport specific clone `destroy' function */
/* Clear xp_master */
}
/*
* Try to exit a non-detached service thread
* - check if there are enough threads left
 * - if this thread (i.e., its clone transport handle) is linked
 *   to a master transport then unlink it
* - free the clone structure
* - return to userland for thread exit
*
 * If this is the last non-detached or the last thread on this
 * transport then the call to svc_clone_unlink() will, respectively,
 * call the closeproc or destroy the transport.
*/
static void
{
if (clone_xprt->xp_master)
/* return - thread exit will be handled at user level */
return;
/* return - thread exit will be handled at user level */
}
/*
* Exit a detached service thread that returned to svc_run
* - decrement the `detached thread' count for the pool
* - unlink the detached clone transport handle from the master
* - free the clone structure
* - return to userland for thread exit
*
* If this is the last thread on this transport then the call
* to svc_clone_unlinkdetached() will destroy the transport.
*/
static void
{
/* This must be a detached thread */
/* return - thread exit will be handled at user level */
return;
/* return - thread exit will be handled at user level */
}
/*
* PSARC 2003/523 Contract Private Interface
* svc_wait
* Changes must be reviewed by Solaris File Sharing
* Changes must be communicated to contract-2003-523@sun.com
*/
int
{
int err = 0;
struct svc_globals *svc;
return (ENOENT);
/* Check if there's already a user thread waiting on this pool */
if (pool->p_user_waiting) {
return (EBUSY);
}
/* Go to sleep, waiting for the signaled flag. */
/* Interrupted, return to handle exit or signal */
/*
* Thread has been interrupted and therefore
* the service daemon is leaving as well so
* let's go ahead and remove the service
* pool at this time.
*/
return (EINTR);
}
}
/*
* About to exit the service pool. Set return value
* to let the userland code know our intent. Signal
* svc_thread_creator() so that it can clean up the
* pool structure.
*/
if (pool->p_user_exit) {
}
/* Return to userland with error code, for possible thread creation. */
return (err);
}
/*
* `Service threads' creator thread.
 * The creator thread waits for a signal to create a new thread.
*/
static void
{
"svc_thread_creator");
for (;;) {
/* Check if someone set the exit flag */
if (pool->p_creator_exit)
break;
		/* Clear the `signaled' flag and go to sleep */
/* Check if someone signaled to exit */
if (pool->p_creator_exit)
break;
/*
* When the pool is in closing state and all the transports
* are gone the creator should not create any new threads.
*/
continue;
}
}
/*
* Create a new service thread now.
*/
pool->p_maxthreads) {
/*
* Signal the service pool wait thread
* only if it hasn't already been signaled.
*/
}
}
}
/*
* Pool is closed. Cleanup and exit.
*/
/* Signal userland creator thread that it can stop now. */
/* Wait for svc_wait() to be done with the pool */
while (pool->p_user_waiting) {
}
zthread_exit();
}
/*
* If the creator thread is idle signal it to create
* a new service thread.
*/
static void
{
}
}
/*
* Notify the creator thread to clean up and exit.
*/
static void
{
}
/*
* Polling part of the svc_run().
* - search for a transport with a pending request
* - when one is found then latch the request lock and return to svc_run()
 * - if there is no request go to sleep and wait for a signal
* - handle two exceptions:
* a) current transport is closing
* b) timeout waiting for a new request
* in both cases return to svc_run()
*/
static SVCMASTERXPRT *
{
/*
* Main loop iterates until
	 * a) we find a pending request,
	 * b) we detect that the current transport is closing, or
	 * c) we time out waiting for a new request.
*/
for (;;) {
/*
* Step 1.
* Check if there is a pending request on the current
* transport handle so that we can avoid cloning.
* If so then decrement the `pending-request' count for
* the pool and return to svc_run().
*
		 * We need to prevent potential starvation. When
		 * a selected transport always has pending requests coming
		 * in, the service threads will never switch to
		 * another transport. With a limited number of service
		 * threads some transports may never be serviced.
* To prevent such a scenario we pick up at most
* pool->p_max_same_xprt requests from the same transport
* and then take a hint from the xprt-ready queue or walk
* the transport list.
*/
if (xprt->xp_req_head)
return (xprt);
}
clone_xprt->xp_same_xprt = 0;
/*
* Step 2.
* If there is no request on the current transport try to
* find another transport with a pending request.
*/
/*
* Make sure that transports will not be destroyed just
* while we are checking them.
*/
for (;;) {
/*
* Get the next transport from the xprt-ready queue.
* This is a hint. There is no guarantee that the
* transport still has a pending request since it
* could be picked up by another thread in step 1.
*
* If the transport has a pending request then keep
* it locked. Decrement the `pending-requests' for
* the pool and `walking-threads' counts, and return
* to svc_run().
*/
if (hint->xp_req_head) {
return (hint);
}
}
/*
* If there was no hint in the xprt-ready queue then
			 * - if there are fewer pending requests than polling
			 *   threads, go to sleep
			 * - otherwise check if there was an overflow in the
			 *   xprt-ready queue; if so, then we need to break
			 *   out into the `drain' mode
*/
goto sleep;
}
if (pool->p_qoverflow) {
break;
}
}
}
/*
* If there was an overflow in the xprt-ready queue then we
* need to switch to the `drain' mode, i.e. walk through the
* pool's transport list and search for a transport with a
* pending request. If we manage to drain all the pending
* requests then we can clear the overflow flag. This will
* switch svc_poll() back to taking hints from the xprt-ready
* queue (which is generally more efficient).
*
		 * If there are no registered transports simply go to sleep.
*/
goto sleep;
}
/*
* `Walk' through the pool's list of master server
		 * transport handles. Continue to loop until there are fewer
		 * looping threads than pending requests.
*/
for (;;) {
/*
* Check if there is a request on this transport.
*
* Since blocking on a locked mutex is very expensive
			 * check for a request without a lock first. We may
			 * miss a request that is just being delivered, but
			 * this will cost at most one full walk through the
			 * list.
*/
if (next->xp_req_head) {
/*
* Check again, now with a lock.
*/
if (next->xp_req_head) {
return (next);
}
}
/*
* Continue to `walk' through the pool's
* transport list until there is less requests
* than walkers. Check this condition without
* a lock first to avoid contention on a mutex.
*/
/* Check again, now with the lock. */
break; /* goto sleep */
}
}
/*
		 * No work to do. Stop the `walk' and go to sleep.
* Decrement the `walking-threads' count for the pool.
*/
/*
* Count us as asleep, mark this thread as safe
* for suspend and wait for a request.
*/
/*
* If the drowsy flag is on this means that
* someone has signaled a wakeup. In such a case
		 * the `asleep-threads' count has already been updated
* so just clear the flag.
*
* If the drowsy flag is off then we need to update
* the `asleep-threads' count.
*/
/*
			 * If the thread is here because it timed out,
* instead of returning SVC_ETIMEDOUT, it is
* time to do some more work.
*/
if (timeleft == -1)
timeleft = 1;
} else {
}
/*
* If we received a signal while waiting for a
* request, inform svc_run(), so that we can return
* to user level and exit.
*/
if (timeleft == 0)
return (SVC_EINTR);
/*
* If the current transport is gone then notify
* svc_run() to unlink from it.
*/
return (SVC_EXPRTGONE);
/*
* If we have timed out waiting for a request inform
* svc_run() that we probably don't need this thread.
*/
if (timeleft == -1)
return (SVC_ETIMEDOUT);
}
}
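/*
 * A simplified sketch of the sleep step above (guarded by SVC_EXAMPLES so
 * it is normally compiled out; the p_timeout field name is an assumption).
 * The real code also manages the p_drowsy handshake and CPR safety. The
 * cv_reltimedwait_sig() return value drives the decisions above: 0 means
 * a signal arrived (SVC_EINTR), -1 means the timeout expired
 * (SVC_ETIMEDOUT), and a positive value means an ordinary wakeup.
 */
#ifdef SVC_EXAMPLES
static clock_t
svc_poll_sleep_sketch(SVCPOOL *pool)
{
	clock_t timeleft;

	mutex_enter(&pool->p_req_lock);
	pool->p_asleep++;		/* count this thread as asleep */
	timeleft = cv_reltimedwait_sig(&pool->p_req_cv, &pool->p_req_lock,
	    pool->p_timeout, TR_CLOCK_TICK);
	mutex_exit(&pool->p_req_lock);

	return (timeleft);
}
#endif	/* SVC_EXAMPLES */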
/*
 * Calculate the memory space used by a message.
*/
static size_t
{
return (count);
}
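/*
 * An illustrative sketch of the size walk above (guarded by SVC_EXAMPLES
 * so it is normally compiled out): sum the data bytes of every mblk in
 * the chain using the standard STREAMS b_cont linkage.
 */
#ifdef SVC_EXAMPLES
static size_t
svc_msgsize_sketch(mblk_t *mp)
{
	size_t count = 0;

	for (; mp != NULL; mp = mp->b_cont)
		count += MBLKL(mp);	/* b_wptr - b_rptr of this block */

	return (count);
}
#endif	/* SVC_EXAMPLES */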
/*
* svc_flowcontrol() attempts to turn the flow control on or off for the
* transport.
*
 * On input, xprt->xp_full determines whether the flow control is currently
* off (FALSE) or on (TRUE). If it is off we do tests to see whether we should
* turn it on, and vice versa.
*
* There are two conditions considered for the flow control. Both conditions
 * have a low and a high watermark. Once the high watermark is reached in
* EITHER condition the flow control is turned on. For turning the flow
* control off BOTH conditions must be below the low watermark.
*
* Condition #1 - Number of requests queued:
*
* The max number of threads working on the pool is roughly pool->p_maxthreads.
* Every thread could handle up to pool->p_max_same_xprt requests from one
* transport before it moves to another transport. See svc_poll() for details.
* In case all threads in the pool are working on a transport they will handle
* no more than enough_reqs (pool->p_maxthreads * pool->p_max_same_xprt)
* requests in one shot from that transport. We are turning the flow control
* on once the high watermark is reached for a transport so that the underlying
* queue knows the rate of incoming requests is higher than we are able to
* handle.
*
* The high watermark: 2 * enough_reqs
* The low watermark: enough_reqs
*
 * Condition #2 - Memory consumed by queued requests:
 *
* We want to prevent a particular pool exhausting the memory, so once the
* total length of queued requests for the whole pool reaches the high
* watermark we start to turn on the flow control for significant memory
* consumers (individual transports). To keep the implementation simple
* enough, this condition is not exact, because we count only the data part of
* the queued requests and we ignore the overhead. For our purposes this
* should be enough. We should also consider that up to pool->p_maxthreads
* threads for the pool might work on large requests (this is not counted for
 * this condition). We need to leave some space for the rest of the system
 * and for
* other big memory consumers (like ZFS). Also, after the flow control is
* turned on (on cots transports) we can start to accumulate a few megabytes in
* queues for each transport.
*
* Usually, the big memory consumers are NFS WRITE requests, so we do not
* expect to see this condition met for other than NFS pools.
*
* The high watermark: 1/5 of available memory
* The low watermark: 1/6 of available memory
*
* Once the high watermark is reached we turn the flow control on only for
* transports exceeding a per-transport memory limit. The per-transport
* fraction of memory is calculated as:
*
* the high watermark / number of transports
*
* For transports with less than the per-transport fraction of memory consumed,
* the flow control is not turned on, so they are not blocked by a few "hungry"
* transports. Because of this, the total memory consumption for the
* particular pool might grow up to 2 * the high watermark.
*
* The individual transports are unblocked once their consumption is below:
*
* per-transport fraction of memory / 2
*
* or once the total memory consumption for the whole pool falls below the low
* watermark.
*
*/
static void
{
/* Should we turn the flow control on? */
/* Is flow control disabled? */
if (svc_flowcontrol_disable != 0)
return;
		/* Are there enough requests queued? */
return;
}
/*
* If this pool uses over 20% of memory and this transport is
		 * a significant memory consumer then we are full
*/
return;
}
/* We might want to turn the flow control off */
/* Do we still have enough requests? */
return;
/*
* If this pool still uses over 16% of memory and this transport is
	 * still a significant memory consumer then we are still full
*/
return;
/* Turn the flow control off and make sure rpcmod is notified */
}
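/*
 * A minimal sketch of the condition #1 watermark test described above
 * (guarded by SVC_EXAMPLES so it is normally compiled out; the
 * memory-based condition #2 is omitted). It returns whether flow control
 * should be on, given the current xp_full state:
 */
#ifdef SVC_EXAMPLES
static bool_t
svc_reqs_full_sketch(SVCPOOL *pool, SVCMASTERXPRT *xprt)
{
	int enough_reqs = pool->p_maxthreads * pool->p_max_same_xprt;

	if (!xprt->xp_full) {
		/* Currently off: turn on at the high watermark. */
		return (xprt->xp_reqs > 2 * enough_reqs);
	} else {
		/* Currently on: stay on until below the low watermark. */
		return (xprt->xp_reqs > enough_reqs);
	}
}
#endif	/* SVC_EXAMPLES */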
/*
* Main loop of the kernel RPC server
* - wait for input (find a transport with a pending request).
* - dequeue the request
 * - call a registered server routine to process the request
 *
 * There can be many threads running concurrently in this loop
* on the same or on different transports.
*/
static int
{
/* Allocate a clone transport handle for this thread */
clone_xprt = svc_clone_init();
/*
* The loop iterates until the thread becomes
* idle too long or the transport is gone.
*/
for (;;) {
/*
		 * If the process is exiting, return
		 * immediately without processing any more
* requests.
*/
return (EINTR);
}
/* Find a transport with a pending request */
/*
* If svc_poll() finds a transport with a request
* it latches xp_req_lock on it. Therefore we need
* to dequeue the request and release the lock as
* soon as possible.
*/
(next == SVC_EXPRTGONE ||
next == SVC_ETIMEDOUT ||
/* Ooops! Current transport is closing. Unlink now */
if (next == SVC_EXPRTGONE) {
continue;
}
/* Ooops! Timeout while waiting for a request. Exit */
if (next == SVC_ETIMEDOUT) {
return (0);
}
/*
* Interrupted by a signal while waiting for a
* request. Return to userspace and exit.
*/
return (EINTR);
}
/*
* De-queue the request and release the request lock
* on this transport (latched by svc_poll()).
*/
/*
* If this is a new request on a current transport then
* the clone structure is already properly initialized.
* Otherwise, if the request is on a different transport,
* unlink from the current master and link to
* the one we got a request on.
*/
if (xprt)
}
/*
* If there are more requests and req_cv hasn't
* been signaled yet then wake up one more thread now.
*
* We avoid signaling req_cv until the most recently
* signaled thread wakes up and gets CPU to clear
* the `drowsy' flag.
*/
else {
}
}
/*
		 * If there are no asleep/signaled threads, we are
		 * still below pool->p_maxthreads limit, and no thread is
* currently being created then signal the creator
* for one more service thread.
*
* The asleep and drowsy checks are not protected
* by a lock since it hurts performance and a wrong
		 * decision is not critical.
*/
/*
* Process the request.
*/
/* If thread had a reservation it should have been canceled */
/*
* If the clone is marked detached then exit.
* The rpcmod slot has already been released
* when we detached this thread.
*/
if (clone_xprt->xp_detached) {
return (0);
}
/*
* Release our reference on the rpcmod
* slot attached to xp_wq->q_ptr.
*/
if (enable)
}
/* NOTREACHED */
}
/*
* Flush any pending requests for the queue and
* free the associated mblks.
*/
void
{
/*
* clean up the requests
*/
/* remove the request from the list */
}
}
/*
* This routine is called by rpcmod to inform kernel RPC that a
* queue is closing. It is called after all the requests have been
* picked up (that is after all the slots on the queue have
* been released by kernel RPC). It is also guaranteed that no more
 * requests will be delivered on this transport.
*
 * - clear xp_wq to mark the master server transport handle as closing
 * - if there are no more threads linked to the transport, destroy it now;
 *   otherwise the last thread to unlink itself will destroy the transport
 *   later.
*/
void
{
/*
* If there is no master xprt associated with this stream,
* then there is nothing to do. This happens regularly
* with connection-oriented listening streams created by
* nfsd.
*/
return;
}
if (xprt->xp_threads == 0) {
/*
* svc_xprt_cleanup() destroys the transport
* or releases the transport thread lock
*/
/*
* If the pool is in closing state and this was
* the last transport in the pool then signal the creator
* thread to clean up and exit.
*/
return;
}
} else {
/*
* There are still some threads linked to the transport. They
* are very likely sleeping in svc_poll(). We could wake up
* them by broadcasting on the p_req_cv condition variable, but
* that might give us a performance penalty if there are too
* many sleeping threads.
*
* Instead, we do nothing here. The linked threads will unlink
* themselves and destroy the transport once they are woken up
* on timeout, or by new request. There is no reason to hurry
* up now with the thread wake up.
*/
/*
* NOTICE: No references to the master transport structure
* beyond this point!
*/
}
}
/*
* Interrupt `request delivery' routine called from rpcmod
* - put a request at the tail of the transport request queue
* - insert a hint for svc_poll() into the xprt-ready queue
* - increment the `pending-requests' count for the pool
* - handle flow control
* - wake up a thread sleeping in svc_poll() if necessary
* - if all the threads are running ask the creator for a new one.
*/
{
/*
* Step 1.
* Grab the transport's request lock and the
* pool's request lock so that when we put
* the request at the tail of the transport's
* request queue, possibly put the request on
* the xprt ready queue and increment the
* pending request count it looks atomic.
*/
return (FALSE);
}
else
/*
* Step 2.
* Insert a hint into the xprt-ready queue, increment
* counters, handle flow control, and wake up
* a thread sleeping in svc_poll() if necessary.
*/
/* Insert pointer to this transport into the xprt-ready queue */
/* Increment counters */
/* Handle flow control */
if (flowcontrol)
/*
* If there are more requests and req_cv hasn't
* been signaled yet then wake up one more thread now.
*
* We avoid signaling req_cv until the most recently
* signaled thread wakes up and gets CPU to clear
* the `drowsy' flag.
*/
} else {
/*
* Signal wakeup and drop the request lock.
*/
}
/*
* Step 3.
	 * If there are no asleep/signaled threads, we are
	 * still below pool->p_maxthreads limit, and no thread is
* currently being created then signal the creator
* for one more service thread.
*
	 * The asleep and drowsy checks are not protected
	 * by a lock since it hurts performance and a wrong
	 * decision is not critical.
*/
"svc_queuereq_end:(%S)", "end");
return (TRUE);
}
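/*
 * A condensed sketch of the `drowsy' wakeup used above and in svc_run()
 * (guarded by SVC_EXAMPLES so it is normally compiled out): signal
 * p_req_cv only when no previously signaled thread is still on its way
 * to the CPU, so a burst of requests does not wake the whole pool.
 */
#ifdef SVC_EXAMPLES
static void
svc_wake_one_sketch(SVCPOOL *pool)
{
	ASSERT(MUTEX_HELD(&pool->p_req_lock));

	if (pool->p_drowsy == FALSE && pool->p_asleep > 0) {
		pool->p_drowsy = TRUE;	/* cleared by the woken thread */
		pool->p_asleep--;
		cv_signal(&pool->p_req_cv);
	}
}
#endif	/* SVC_EXAMPLES */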
/*
* Reserve a service thread so that it can be detached later.
* This reservation is required to make sure that when it tries to
* detach itself the total number of detached threads does not exceed
 * pool->p_maxthreads - pool->p_redline (i.e. that we always keep
 * at least pool->p_redline non-detached threads).
*
* If the thread does not detach itself later, it should cancel the
* reservation before returning to svc_run().
*
 * - check if there is room for the reservation
 * - if so, then increment the `reserved threads' count for the pool
 * - mark the thread as reserved (setting the flag in the clone transport
 *   handle for this thread)
* - returns 1 if the reservation succeeded, 0 if it failed.
*/
int
{
/* Recursive reservations are not allowed */
/* Check pool counts if there is room for reservation */
return (0);
}
/* Mark the thread (clone handle) as reserved */
return (1);
}
/*
* Cancel a reservation for a thread.
* - decrement the `reserved threads' count for the pool
* - clear the flag in the clone transport handle for this thread.
*/
void
{
/* Thread must have a reservation */
/* Decrement global count */
/* Clear reservation flag */
}
/*
* Detach a thread from its transport, so that it can block for an
* extended time. Because the transport can be closed after the thread is
* detached, the thread should have already sent off a reply if it was
* going to send one.
*
* - decrement `non-detached threads' count and increment `detached threads'
* counts for the transport
* - decrement the `non-detached threads' and `reserved threads'
* counts and increment the `detached threads' count for the pool
* - release the rpcmod slot
* - mark the clone (thread) as detached.
*
* No need to return a pointer to the thread's CPR information, since
* the thread has a userland identity.
*
* NOTICE: a thread must not detach itself without making a prior reservation
* through svc_thread_reserve().
*/
{
/* Thread must have a reservation */
/* Bookkeeping for this transport */
xprt->xp_threads--;
/* Bookkeeping for the pool */
/* Release an rpcmod slot for this request */
if (enable)
/* Mark the clone (thread) as detached */
return (NULL);
}
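/*
 * Illustrative usage of the reservation protocol described above (guarded
 * by SVC_EXAMPLES so it is normally compiled out; the
 * svc_reserve_thread()/svc_detach_thread() names are assumed from the
 * comments above). A reply must be sent before detaching, since the
 * transport may be closed once the thread is detached; a thread that
 * reserves but decides not to detach must cancel the reservation instead.
 */
#ifdef SVC_EXAMPLES
static void
svc_slow_op_sketch(SVCXPRT *clone_xprt)
{
	if (svc_reserve_thread(clone_xprt) == 0) {
		/* No room below the redline: stay attached, do not block. */
		return;
	}

	/* ... send the reply here, while the transport is still usable ... */

	(void) svc_detach_thread(clone_xprt);
	/* ... now safe to block for an extended amount of time ... */
}
#endif	/* SVC_EXAMPLES */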
/*
 * This routine is responsible for extracting the RDMA plugin master XPRT,
 * unregistering it from the SVCPOOL, and initiating plugin specific cleanup.
 * It operates on the rdma transports
 * active in a given registered or unregistered kRPC thread pool. It shuts
 * down all active rdma transports in that pool. If the thread active on the
 * transport happens to be the last thread for that pool, it will signal the
 * creator thread to clean up the pool and destroy the xprt in
 * svc_queueclose().
*/
void
{
queue_t *q;
int i, rtg_count;
if (rdma_xprts->rtg_count == 0)
return;
for (i = 0; i < rtg_count; i++) {
rdma_xprts->rtg_count--;
/* remove the request from the list */
}
svc_queueclose(q);
#ifdef DEBUG
if (rdma_check)
#endif
/*
* Free the rdma transport record for the expunged rdma
* based master transport handle.
*/
if (!rdma_xprts->rtg_listhead)
break;
}
}
/*
* Currently only used by svc_rpcsec_gss.c but put in this file as it
* may be useful to others in the future.
 * But future consumers should be careful because so far
*/
struct rpc_msg *
{
/* dup opaque auth call body cred */
/* dup or just alloc opaque auth call body verifier */
} else {
}
return (dst);
return (NULL);
}
void
{
kmem_free(m, sizeof (*m));
m = NULL;
}