port.c revision 7c478bd95313f5f23a4c958a745db2134aa03244
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License, Version 1.0 only
 * (the "License").  You may not use this file except in compliance
 * with the License.
 *
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 *
 * Copyright 2004 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.

#pragma ident "%Z%%M% %I% %E% SMI"

 * Event Ports can be shared across threads or across processes.
 * can use a single port. A major request was also to get the ability
 * to submit user-defined events to a port. The idea of the
 * user-defined events is to use the event ports for communication between
 * in a port with the same priority as other event types.
 * for events with the "highest priority" (priority here is related to the
 * internal strategy to wake up waiting threads) will retrieve the event,
 * the requirement to have events which should be submitted immediately
 * to all "waiting" threads. That is the main task of the alert event.
 * The alert event is submitted by the application to a port. The port
 * changes from a standard mode to the alert mode. Now all waiting threads
 * will be awakened immediately and they will return with the alert event.
 * Threads trying to retrieve events from a port in alert mode will
 * return immediately with the alert event.
 *
 * An event port is like a kernel queue, which accepts events submitted from
 * user level as well as events submitted from kernel sub-systems. Sub-systems
 * able to submit events to a port are the so-called "event sources".
 * Current event sources:
 * PORT_SOURCE_AIO   : events submitted per transaction completion from
 *                     POSIX-I/O framework.
 * PORT_SOURCE_TIMER : events submitted when a timer fires
 *                     (see timer_create(3RT)).
 * PORT_SOURCE_FD    : events submitted per file descriptor (see poll(2)).
 * PORT_SOURCE_ALERT : events submitted from user. This is not really a
 *                     single event, this is actually a port mode
 *                     (see port_alert(3c)).
 * PORT_SOURCE_USER  : events submitted by applications with
 *                     port_send(3c) or port_sendn(3c).
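The alert mode described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `model_port`, `model_event`, `model_send()`, `model_get()`, `model_alert_set()` and `model_alert_clear()` are hypothetical names invented for the illustration and are not part of the event ports API or of this file. The model shows one property only: while the port is in alert mode, every retrieval returns the alert data and the queued events stay untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of a port; this is not the kernel's port_t. */
#define	MODEL_QLEN	8

enum model_source { MODEL_SOURCE_USER, MODEL_SOURCE_ALERT };

struct model_event {
	enum model_source source;
	int events;
	void *user;
};

struct model_port {
	struct model_event queue[MODEL_QLEN];	/* pending events, FIFO ring */
	size_t head;				/* index of the oldest event */
	size_t count;				/* number of queued events */
	int alert_mode;				/* non-zero while in alert mode */
	struct model_event alert;		/* data returned in alert mode */
};

/* Queue a user event. Returns 0 on success, -1 if the queue is full. */
static int
model_send(struct model_port *p, int events, void *user)
{
	size_t tail;

	assert(p != NULL);
	if (p->count == MODEL_QLEN)
		return (-1);
	tail = (p->head + p->count) % MODEL_QLEN;
	p->queue[tail].source = MODEL_SOURCE_USER;
	p->queue[tail].events = events;
	p->queue[tail].user = user;
	p->count++;
	return (0);
}

/* Switch the port to alert mode and remember the alert data. */
static void
model_alert_set(struct model_port *p, int events, void *user)
{
	p->alert.source = MODEL_SOURCE_ALERT;
	p->alert.events = events;
	p->alert.user = user;
	p->alert_mode = 1;
}

/* Switch back to standard mode; queued events become retrievable again. */
static void
model_alert_clear(struct model_port *p)
{
	p->alert_mode = 0;
}

/*
 * Non-blocking retrieval. Returns 1 and fills *out if an event was
 * returned, 0 if the port is in standard mode and the queue is empty.
 * In alert mode the alert data is returned and is not consumed.
 */
static int
model_get(struct model_port *p, struct model_event *out)
{
	if (p->alert_mode) {
		*out = p->alert;
		return (1);
	}
	if (p->count == 0)
		return (0);
	*out = p->queue[p->head];
	p->head = (p->head + 1) % MODEL_QLEN;
	p->count--;
	return (1);
}
```

The model leaves out everything that makes the real implementation hard (sleeping threads, wakeups, locking); it only fixes the ordering rule between the alert and the normal queue.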
 * There is a user API implemented in the libc library as well as a
 * kernel API.
 * The available user API functions are:
 * port_create()     : create a port as a file descriptor of portfs file system
 *                     The standard close(2) function closes a port.
 * port_associate()  : associate a file descriptor with a port to be able to
 *                     retrieve events from that file descriptor.
 * port_dissociate() : remove the association of a file descriptor with a port.
 * port_send()       : send an event of type PORT_SOURCE_USER to a port
 * port_sendn()      : send an event of type PORT_SOURCE_USER to a list of ports
 * port_get()        : retrieve a single event from a port
 * port_getn()       : retrieve a list of events from a port
 *
 * The available kernel API functions are:
 * port_init_event()         : set event data in the event structure
 * port_send_event()         : send event to a port
 * port_associate_ksource()  : associate a kernel event source with a port
 * port_dissociate_ksource() : dissociate a kernel event source from a port
 *
 * The libc implementation consists of small functions which pass the
 * arguments to the kernel using the "portfs" system call. It means, all the
 * synchronisation work is being done in the kernel. The "portfs" system
 * call loads the portfs file system into the kernel.
 *
 * The first function to be used is port_create() which internally creates
 * a vnode and a portfs node. The portfs node is represented by the port_t
 * structure, which again includes all the data necessary to control a port.
 * port_create() returns a file descriptor, which needs to be used in almost
 * all other event port functions.
 * The maximum number of ports per system is controlled by the resource
 * control: project.max-port-ids.
 *
 * The second step is the triggering of events, which could be sent to a port.
 * Every event source implements its own method to generate events for a port:
 * PORT_SOURCE_AIO:
 *      The sigevent structure of the standard POSIX-IO functions
 *      was extended by an additional notification type.
 *      Standard notification types:
 *              SIGEV_NONE, SIGEV_SIGNAL and SIGEV_THREAD
 *      Event ports now introduce SIGEV_PORT.
 *      The notification type SIGEV_PORT specifies that a structure
 *      of type port_notify_t has to be attached to the sigev_value.
 *      The port_notify_t structure contains the event port file
 *      descriptor and a user-defined pointer.
 *      Internally the AIO implementation will use the kernel API
 *      functions to allocate an event port slot per transaction (aiocb)
 *      and send the event to the port as soon as the transaction completes.
 *      All the events submitted per transaction are of type
 *      PORT_SOURCE_AIO.
 * PORT_SOURCE_TIMER:
 *      The timer_create() function uses the same method as the
 *      PORT_SOURCE_AIO event source. It also uses the sigevent structure
 *      to deliver the port information.
 *      Internally the timer code will allocate a single event slot/struct
 *      per timer and it will send the timer event as soon as the timer
 *      fires. If the timer-fired event is not delivered to the application
 *      before the next period elapsed, then an overrun counter will be
 *      incremented. The timer event source uses a callback function to
 *      detect the delivery of the event to the application. At that time
 *      the timer callback function will update the event overrun counter.
 * PORT_SOURCE_FD:
 *      This event source uses the port_associate() function to allocate
 *      an event slot/struct from a port.
 *      The application defines in the
 *      events argument of port_associate() the type of events which it is
 *      The internal pollwakeup() function is used by all the file
 *      systems (those which support the VOP_POLL() interface) to notify
 *      the upper layer (poll(2), devpoll(7d) and now event ports) about
 *      the event triggered (see valid events in poll(2)).
 *      The pollwakeup() function forwards the event to the layer registered
 *      to receive the current event.
 *      The port_dissociate() function can be used to free the allocated
 *      event slot from the port. Anyway, file descriptors deliver events
 *      only one time and remain deactivated until the application
 *      reactivates the association of a file descriptor with port_associate().
 *      If an associated file descriptor is closed then the file descriptor
 *      will be dissociated automatically from the port.
 * PORT_SOURCE_ALERT:
 *      This event type is generated when the port was previously set in
 *      alert mode using the port_alert() function.
 *      A single alert event is delivered to every thread which tries to
 *      retrieve events from a port.
 * PORT_SOURCE_USER:
 *      This type of event is generated from user level using the port_send()
 *      function to send a user event to a port or the port_sendn() function
 *      to send an event to a list of ports.
 *
 * EVENT DELIVERY / RETRIEVING EVENTS
 * Events remain in the port queue until:
 * - the application uses port_get() or port_getn() to retrieve events,
 * - the event source cancels the event,
 * - the event port is closed or
 * - the process exits.
 * The maximal number of events in a port queue is the maximal number
 * control: process.max-port-events.
 * The port_get() function retrieves a single event and the port_getn()
 * function retrieves a list of events.
 * Events are classified as shareable and non-shareable events across processes.
 * Non-shareable events are invisible for the port_get(n)() functions of
 * processes other than the owner of the event.
 * Shareable event types are:
 * PORT_SOURCE_USER events
 *      This type of event is unconditionally shareable and without
 *      limitations. If the parent process sends a user event and closes
 *      the port afterwards, the event remains in the port and the child
 *      process will still be able to retrieve the user event.
 * PORT_SOURCE_ALERT events
 *      This type of event is shareable between processes.
 *      Limitation:     The alert mode of the port is removed if the owner
 *                      (process which set the port in alert mode) of the
 *                      alert event closes the port.
 * PORT_SOURCE_FD events
 *      This type of event is conditionally shareable between processes.
 *      After fork(2) all forked file descriptors are shareable between
 *      the processes. The child process is allowed to retrieve events
 *      from the associated file descriptors and it can also re-associate
 *      the fd with the port.
 *      Limitations:    The child process is not allowed to dissociate
 *                      the file descriptor from the port. Only the
 *                      owner (process) of the association is allowed to
 *                      dissociate the file descriptor from the port.
 *                      If the owner of the association closes the port
 *                      the association will be removed.
 * PORT_SOURCE_AIO events
 *      This type of event is not shareable between processes.
 * PORT_SOURCE_TIMER events
 *      This type of event is not shareable between processes.
 *
 * FORK BEHAVIOUR
 * On fork(2) the child process inherits all opened file descriptors from
 * the parent process. This is also valid for port file descriptors.
 * Associated file descriptors with a port maintain the association across the
 * fork(2). It means, the child process gets full access to the port and
 * it can retrieve events from all common associated file descriptors.
 * Events of file descriptors created and associated with a port after the
 * fork(2) are non-shareable and can only be retrieved by the same process.
 *
 * If the parent or the child process closes an exported port (using fork(2)
 * or I_SENDFD) all the file descriptors associated with the port by the
 * process will be dissociated from the port. Events of dissociated file
 * descriptors as well as all non-shareable events will be discarded.
 * The other process can continue working with the port as usual.
 *
 * close(2) has to be used to close a port. See FORK BEHAVIOUR for details.
 *
 * The global control structure of the event ports framework is port_control_t.
 * port_control_t keeps track of the number of created ports in the system.
 * The cache of the port event structures is also located in port_control_t.
 *
 * On port_create() the vnode and the portfs node are also created.
 * The portfs node is represented by the port_t structure.
 * The port_t structure manages all port specific tasks:
 * - management of resource control values
 * - port VOP_POLL interface
 * - uid and gid of the port
 *
 * The port_t structure contains the port_queue_t structure.
 * The port_queue_t structure contains all the data necessary for the
 * - submitted events (represented by port_kevent_t structures)
 * - threads waiting for event delivery (check portget_t structure)
 * - PORT_SOURCE_FD cache (managed by the port_fdcache_t structure)
 * - event source management (managed by the port_source_t structure)
 * - alert mode management (check port_alert_t structure)
 *
 * The event port file system creates a kmem_cache for internal allocation of
 *
 * 1. Event source association with a port:
 * The first step to do for event sources is to get associated with a port
 * using the port_associate_ksource() function or adding an entry to the
 * port_ksource_tab[]. An event source can get dissociated from a port
 * using the port_dissociate_ksource() function.
 * An entry in the
 * port_ksource_tab[] implies that the source will be associated
 * automatically with every newly created port.
 * The event source can deliver a callback function, which is used by the
 * port to notify the event source about close(2). The idea is that
 * in such a case the event source should free all allocated resources
 * The port_close() function will wait until all allocated event
 * The callback function is not necessary when the event source does not
 * maintain local resources. A second condition is that the event source
 * can guarantee that allocated event slots will be returned without
 * delay to the port (it will not block and sleep somewhere).
 *
 * 2. Reservation of an event slot / event structure
 * The event port reliability is based on the reservation of an event "slot"
 * (allocation of an event structure) by the event source as part of the
 * application call. If the maximal number of event slots is exhausted then
 * the event source can return a corresponding error code to the application.
 *
 * The port_alloc_event() function has to be used by event sources to
 * allocate an event slot (reserve an event structure). The port_alloc_event()
 * function does not block and it will return a 0 value on success or an
 * error code
 * An argument of port_alloc_event() is a flag which determines the behavior
 * of the event after it was delivered to the application:
 * PORT_ALLOC_DEFAULT : event slot becomes free after delivery to the
 *                      application.
 * PORT_ALLOC_PRIVATE : event slot remains under the control of the event
 *                      source. This kind of slot can not be used for
 *                      event delivery and should only be used internally
 * PORT_KEV_CACHED    : event slot remains under the control of an event
 *                      port cache. It does not become free after delivery
 * PORT_ALLOC_SCACHED : event slot remains under the control of the event
 *                      source. The event source takes the control over
 *                      the slot after the event is delivered to the
 *                      application.
 *
 * 3. Delivery of events to the event port
 * Earlier allocated event structure/slot has to be used to deliver
 * event data to the port. Event source has to use the function
 * port_send_event(). The single argument is a pointer to the previously
 * The portkev_events field of the port_kevent_t structure can be updated/set
 * 1. using the port_set_event() function, or
 * 2. updating the portkev_events field out of the callback function:
 *    The event source can deliver a callback function to the port as an
 *    argument of port_init_event().
 *    One of the arguments of the callback function is a pointer to the
 *    events field, which will be delivered to the application.
 *    (see Delivery of events to the application).
 * they remain blocked until the data is delivered to the application and the
 * slot becomes free or it is delivered back to the event source
 * (PORT_ALLOC_SCACHED). The activation of the callback function mentioned above
 * is at the same time the indicator for the event source that the event
 *
 * 4. Delivery of events to the application
 * port queue until they are retrieved by the application or the port
 * is closed (exit(2) also closes all opened file descriptors).
 * The application uses port_get() or port_getn() to retrieve events from
 * a port. port_get() retrieves a single event structure/slot and port_getn()
 * Both functions are able to poll for events and return immediately or they
 * can specify a timeout value.
 * Before the events are delivered to the application they are moved to a
 * second temporary internal queue. The idea is to avoid lock collisions or
 * contentions of the global queue lock.
 * The global queue lock is used every time when an event source delivers
 * new events to the port.
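The move to a temporary queue can be pictured as a constant-time splice of a linked list. The sketch below is a simplified model, not the code of this file: `model_ev`, `model_list` and the `list_*()` helpers are hypothetical names, and locking is left out entirely (in the kernel these steps run under the queue mutexes). Putting undelivered events back at the head of the port queue, ahead of events that arrived in the meantime, is an assumption taken from the comments further down in the file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event and queue types used only for this illustration. */
struct model_ev {
	int id;
	struct model_ev *next;
};

struct model_list {
	struct model_ev *head;
	struct model_ev *tail;
	size_t n;
};

/* Append one event at the tail (what an event source would do). */
static void
list_append(struct model_list *l, struct model_ev *e)
{
	e->next = NULL;
	if (l->tail == NULL)
		l->head = e;
	else
		l->tail->next = e;
	l->tail = e;
	l->n++;
}

/* Remove and return the first event, or NULL if the list is empty. */
static struct model_ev *
list_pop(struct model_list *l)
{
	struct model_ev *e = l->head;

	if (e == NULL)
		return (NULL);
	l->head = e->next;
	if (l->head == NULL)
		l->tail = NULL;
	l->n--;
	return (e);
}

/* Move the whole source list to an empty destination list in O(1). */
static void
list_take_all(struct model_list *dst, struct model_list *src)
{
	assert(dst->head == NULL);
	*dst = *src;
	src->head = src->tail = NULL;
	src->n = 0;
}

/* Put the remaining events of src back in front of dst in O(1). */
static void
list_prepend_all(struct model_list *dst, struct model_list *src)
{
	if (src->head == NULL)
		return;
	src->tail->next = dst->head;
	if (dst->head == NULL)
		dst->tail = src->tail;
	dst->head = src->head;
	dst->n += src->n;
	src->head = src->tail = NULL;
	src->n = 0;
}
```

Because both splices touch only head and tail pointers, the time spent holding the lock that event sources also need stays short and independent of the queue length.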
 * The port_get() and port_getn() functions
 * a) retrieve single events from the temporary queue,
 * b) prepare the data to be passed to the application memory,
 * c) activate the callback function of the event sources:
 *    - to get the latest event data,
 *    - the event source can free all allocated resources associated with the
 *    - the event source can deny the delivery of the event to the application
 *      (e.g. because of the wrong process).
 * d) put the event back to the temporary queue if the event delivery was denied
 * e) repeat a) through d) as long as there are events in the queue and
 *    there is enough user space available.
 *
 * The loop described above could block the global mutex for a very long time.
 * To avoid that, a second mutex was introduced to synchronize concurrent
 * threads accessing the temporary queue.

"32-bit event ports syscalls",
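Steps a) through e) of the header comment can be sketched as a loop over a singly linked temporary queue. This is a minimal model under stated assumptions, not the implementation in this file: `model_kev`, `model_cb_t`, `model_next()`, `model_getn()` and `model_owner_only()` are hypothetical names, the "copy to user space" step is reduced to storing the events value in an array, and a denied event is simply left in place in the queue, which has the same effect as putting it back.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel event; not the real port_kevent_t. */
struct model_kev {
	int events;		/* event data to deliver */
	int owner_pid;		/* process that generated the event */
	int shareable;		/* non-zero if other processes may see it */
	struct model_kev *next;
};

/* Hypothetical source callback: returns 0 to deliver, non-zero to deny. */
typedef int (*model_cb_t)(struct model_kev *kev, int *events, int pid);

/* Example callback: deliver only own or shareable events. */
static int
model_owner_only(struct model_kev *kev, int *events, int pid)
{
	(void) events;		/* this callback does not update the data */
	return ((kev->shareable || kev->owner_pid == pid) ? 0 : 1);
}

/* Head of the queue if last is NULL, else the event following last. */
static struct model_kev *
model_next(struct model_kev *head, struct model_kev *last)
{
	return (last == NULL ? head : last->next);
}

/*
 * Retrieve up to max events for process pid from *queue into out[].
 * Returns the number of events stored. Denied events stay queued.
 */
static size_t
model_getn(struct model_kev **queue, int pid, model_cb_t cb,
    int *out, size_t max)
{
	struct model_kev *lev = NULL;	/* last checked (denied) event */
	struct model_kev *pev;
	size_t n = 0;

	assert(queue != NULL && out != NULL);
	while (n < max && (pev = model_next(*queue, lev)) != NULL) {
		int events = pev->events;		/* b) prepare data */

		if (cb != NULL && cb(pev, &events, pid) != 0) {
			lev = pev;			/* d) denied, skip */
			continue;
		}
		if (lev == NULL)			/* unlink pev */
			*queue = pev->next;
		else
			lev->next = pev->next;
		pev->next = NULL;
		out[n++] = events;			/* deliver */
	}
	return (n);
}
```

A caller in process 100 that runs `model_getn()` over a queue holding one of its own events, one private event of process 200 and one shareable event of process 200 receives the first and the third; the private event of process 200 stays in the queue for its owner.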
#endif /* _SYSCALL32_IMPL */

 * This table contains a list of event sources which need a static
 * association with a port (every port).
 * The last NULL entry in the table is required to detect "end of table".

/* create kmem_cache for port event structures */

 * System call wrapper for all port related system calls from 32-bit programs.

#endif /* _SYSCALL32_IMPL */

 * System entry point for port functions.
 * a0 is a port file descriptor (except for PORT_SENDN and PORT_CREATE).
 * The libc uses PORT_SYS_NOPORT in functions which do not deliver a
 * port file descriptor as first argument.

/* opcodes using port as first argument (a0) */
/* see PORT_GETN description */

 * port_getn() can only retrieve own or shareable events from
 * other processes. The port_getn() function remains in the
 * kernel until own or shareable events are available or the

/* currently only PORT_SOURCE_FD is implemented */
/* user-defined events */

 * library events, blocking
 * Only events of type PORT_SOURCE_AIO are currently allowed.

/* currently only PORT_SOURCE_FD is implemented */
if ((int)a2)    /* a2 = events */

 * System call to create a port.
 * The port_create() function creates a vnode of type VPORT per port.
 * The port control data is associated with the vnode as vnode private data.
 * The port_create() function returns an event port file descriptor.

/* initialize vnode and port private data */

 * Retrieve the maximal number of event ports allowed per system from
 * the resource control: project.max-port-ids.

 * Retrieve the maximal number of events allowed per port from
 * the resource control: process.max-port-events.

/* allocate a new user file descriptor and a file structure */

 * If the file table is full, free allocated resources.

/* set user file pointer */
/* initializes port private data */

 * port_init() initializes event port specific data

 * If there is not enough memory available to satisfy a user
 * request using a single port_getn() call then port_getn()
 * will reduce the size of the list to PORT_MAX_LIST.

/* Set timestamp entries required for fstat(2) requests */
/* initialize port queue structs */
/* Allocate cache skeleton for PORT_SOURCE_FD events */

 * Allocate cache skeleton for association of event sources.

 * pre-associate some kernel sources with this port.
 * The pre-association is required to create port_source_t
 * structures for object association.
 * Some sources can not get associated with a port before the first
 * object association is requested. Another reason to pre-associate
 * a particular source with a port is performance.

 * The port_add_ksource_local() function is being used to associate
 * event sources with every new port.
 * The event sources need to be added to port_ksource_tab[].

/* associate new source with the port */

 * The port_send() function sends an event of type "source" to a
 * port. This function is non-blocking. An event can be sent to
 * a port as long as the number of events per port does not reach the
 * maximal allowed number of events. The max. number of events per port is
 * defined by the resource control process.max-port-events.
 * This function is used by the port library functions port_send()
 * and port_dispatch(). The port_send(3c) function is part of the
 * event ports API and submits events of type PORT_SOURCE_USER. The
 * port_dispatch() function is project private and it is used by library
 * functions to submit events of other types than PORT_SOURCE_USER
 * (e.g. PORT_SOURCE_AIO).

 * The port_noshare() function returns 0 if the current event was generated
 * by the same process. Otherwise it returns a value other than 0 and the
 * event should not be delivered to the current process.
 * The port_noshare() function is normally used by the port_dispatch()
 * function. The port_dispatch() function is project private and can only be
 * used within the event port project.
 * Currently the libaio uses the port_dispatch() function to deliver events
 * of type PORT_SOURCE_AIO.

 * The port_dispatch_event() function is project private and it is used by
 * libraries involved in the project to deliver events to the port.
 * port_dispatch will sleep and wait for enough resources to satisfy the
 * The library can specify if the delivered event is shareable with other
 * processes (see PORT_SYS_NOSHARE flag).

 * The port_sendn() function is the kernel implementation of the event
 * port API function port_sendn(3c).
 * This function is able to send an event to a list of event ports.

 * Scan the list for event port file descriptors and send the
 * attached user event data embedded in an event of type
 * PORT_SOURCE_USER to every event port in the list.
 * If a list entry is not a valid event port then the corresponding
 * error code will be stored in the errors[] list with the same
 * list offset as in the ports[] list.

 * The port_alert() function is a high priority event and it is always set
 * on top of the queue. It is also delivered as single event.
 *      - SET   : overwrite current alert data
 *      - UPDATE: set alert data or return EBUSY if alert mode is already set
 *
 * - wakeup all sleeping threads

/* check alert conditions */

 * Store alert data in the port to be delivered to threads
 * which are using port_get(n) to retrieve events.

/* alert and deliver alert data to waiting threads */
/* no threads waiting for events */

 * Set waiting threads in alert mode (PORTGET_ALERT).
 * Every thread waiting for events already allocated a portget_t
 * The port alert arguments are stored in the portget_t structure.
 * The PORTGET_ALERT flag is set to indicate the thread to return
 * immediately with the alert event.

 * Clear alert state of the port

 * The port_getn() function is used to retrieve events from a port.
 * The port_getn() function returns immediately if there are enough events
 * available in the port to satisfy the request or if the port is in alert
 * mode (see port_alert(3c)).
 * The timeout argument of port_getn(3c), which is embedded in the
 * port_gettimer_t structure, specifies if the system call should block or if
 * it should return immediately depending on the number of events available.
 * This function is internally used by port_getn(3c) as well as by

 * Return number of objects with events

 * The portq_block_mutex is required to synchronize this
 * thread with another possible thread, which could be
 * retrieving events from the port queue.

 * Check if a second thread is currently retrieving events
 * and it is using the temporary event queue.

/* put remaining events back to the port queue */
if (*nget == 0) {       /* no events required */
/* port is being closed ... */
/* return immediately if port in alert mode */

 * Now check if the completed events satisfy the
 * "wait" requirements of the current thread:

 * loop entry of same thread
 * pgt_loop is set when the current thread returns
 * prematurely from this function. That could happen
 * when a port is being shared between processes and
 * this thread could not find events to return.
 * A thread is not allowed to retrieve non-shareable
 * events generated in other processes.
 * PORTQ_WAIT_EVENTS is set when a thread already
 * checked the current event queue and no new events
 * are added to the queue.

/* some new events arrived ...check them */
/* check if enough events are available ... */

 * There are not enough events available to satisfy
 * the request, check timeout value and wait for

if (blocking == 0)      /* don't block, check fired events */
/* enqueue thread in the list of waiting threads */
/* Wait here until return conditions met */
/* reap alert event and return */

 * Check if some other thread is already retrieving
 * events (portq_getn > 0).

/* take thread out of the wait queue */
/* return without events */

 * Move port event queue to a temporary event queue.
 * New incoming events will continue to be posted to the event queue
 * and they will not be considered by the current thread.
 * of the port queue mutex. The contention and performance degradation
 * a) incoming events use the port queue mutex to enqueue new events and
 * b) before the event can be delivered to the application it is
 *    necessary to notify the event sources about the event delivery.
 *    Sometimes the event sources can require a long time to return and
 *    the queue mutex would block incoming events.
 * During this time incoming events (port_send_event()) do not need
 * to awake threads waiting for events. Before the current thread
 * returns it will check the conditions to awake other waiting threads.

 * Move remaining events from previous thread back to the

/* move port event queue to a temporary queue */
lev = NULL;
/* start with first event in the queue */
if (pev == NULL)        /* no more events available */
/* move event data to copyout list */

 * Event can not be delivered to the

lev = pev;      /* last checked event */
lev = NULL;     /* start with first event in the queue */
if (pev == NULL)        /* no more events available */
/* move event data to copyout list */

 * Event can not be delivered to the

lev = pev;      /* last checked event */
#endif /* _SYSCALL32_IMPL */

 * Remember number of remaining events in the temporary event queue.

 * Work to do before return :
 * - push list of remaining events back to the top of the standard
 * - if this is the last thread calling port_get(n) then wakeup the
 *   thread waiting on close(2).
 * - check for a deferred cv_signal from port_send_event() and wakeup

 * move remaining events in the temporary event queue back
 * to the port event queue

/* Last thread => check close(2) conditions ... */
/* do not copyout events */

 * no other threads retrieving events ...
 * check wakeup conditions of sleeping threads

 * Check PORTQ_POLLIN here because the current thread temporarily set
 * the number of events in the queue to zero.

/* now copyout list of user event structures to user space */
/* no events retrieved: check loop conditions */
/* timeout already checked -> remember values */
/* set number of user event structures completed */

 * 1. copy kernel event structure to user event structure.
 * 2. PORT_KEV_WIRED event structures will be reused by the "source"
 * 3. Remove PORT_KEV_DONEQ flag (event removed from the event queue)
 * 4. Other types of event structures can be delivered back to the port cache
 *    (port_free_event_local()).
 * 5. The event source callback function is the last opportunity for the
 *    event source to update events, to free local resources associated with
 *    the event or to deny the delivery of the event.

/* remove event from the queue */

 * Events of type PORT_KEV_WIRED remain allocated by the

 * Event can not be delivered.
 * Caller must reinsert the event into the queue.

 * 1. copy kernel event structure to user event structure.
 * 2. PORT_KEV_WIRED event structures will be reused by the "source"
 * 3. Remove PORT_KEV_DONEQ flag (event removed from the event queue)
 * 4. Other types of event structures can be delivered back to the port cache
 *    (port_free_event_local()).
 * 5. The event source callback function is the last opportunity for the
 *    event source to update events, to free local resources associated with
 *    the event or to deny the delivery of the event.

/* remove event from the queue */

 * Events of type PORT_KEV_WIRED remain allocated by the

 * Event can not be delivered.
 * Caller must reinsert the event into the queue.

#endif /* _SYSCALL32_IMPL */

/* copyout alert event structures to user space */
#endif /* _SYSCALL32_IMPL */

 * Check return conditions :
 * - pending port close(2)
 * - threads waiting for events

 * The port_get_kevent() function returns
 * - the event located at the head of the queue if 'last' pointer is NULL
 * - the next event after the event pointed by 'last'
 * The caller of this function is responsible for the integrity of the queue
 * - port_getn() is using a temporary queue protected with
 *   portq->portq_block_mutex
 * - port_close_events() is working on the global event queue and protects the
 *   queue with portq->portq_mutex.

 * The port_get_timeout() function gets the timeout data from user space
 * and converts that info into a corresponding internal representation.
 * The kerneldata flag means that the timeout data is already loaded.

#endif /* _SYSCALL32_IMPL */

 * Threads requiring more events than available will be put in a wait queue.
 * There is a "thread wait queue" per port.
 * Threads requiring fewer events get a higher priority than others and they

/* first waiting thread */

 * thread waiting for fewer events will be set on top of the queue.

/* add thread to the queue */

 * Take thread out of the queue.

/* last (single) waiting thread */

 * Set up event port kstats.
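The ordering rule of the thread wait queue (threads waiting for fewer events sit closer to the head) can be sketched as a sorted insert. This is an illustrative model only: `model_waiter`, `model_enqueue()` and `model_dequeue()` are hypothetical names, the list is a plain singly linked list rather than whatever layout the kernel uses for its portget_t entries, and placing a new waiter behind existing waiters with the same request size is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical waiter record; not the kernel's portget_t. */
struct model_waiter {
	int nget;			/* number of events the thread needs */
	int tid;			/* identifies the waiting thread */
	struct model_waiter *next;
};

/*
 * Insert w so that the queue stays sorted by ascending nget.
 * Waiters with an equal nget keep their arrival order (assumption).
 */
static void
model_enqueue(struct model_waiter **head, struct model_waiter *w)
{
	struct model_waiter **pp = head;

	assert(head != NULL && w != NULL);
	while (*pp != NULL && (*pp)->nget <= w->nget)
		pp = &(*pp)->next;
	w->next = *pp;
	*pp = w;
}

/* Take w out of the queue; does nothing if w is not queued. */
static void
model_dequeue(struct model_waiter **head, struct model_waiter *w)
{
	struct model_waiter **pp;

	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == w) {
			*pp = w->next;
			w->next = NULL;
			return;
		}
	}
}
```

With this ordering, a sender that has just queued a single event only needs to look at the head of the wait queue to decide whether any waiting thread can be satisfied.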