udp.c revision 8347601bcb0a439f6e50fc36b4039a73d08700e1
* +---------------+---------------+ * +---------------+---------------+ * | ip_wq | ip_rq | conn_t * +---------------+---------------+ * Messages arriving at udp_wq from above will end up in ip_wq before * they are processed, i.e. udp write entry points will advance udp_wq * and use its q_next value as ip_wq in order to use the conn_t that * is stored in its q_ptr. Likewise, messages generated by ip to the * module above udp will appear as if they originated from udp_rq, * i.e. putnext() calls to the module above udp are done using the * udp_rq instead of ip_rq in order to avoid udp_rput(), which does * nothing more than call putnext(). * The above implies the following rules of thumb: * 1. udp_t is obtained from conn_t, which is created by the /dev/ip * instance and is stored in q_ptr of both ip_wq and ip_rq. There * is no direct reference to conn_t from either udp_wq or udp_rq. * 2. Write-side entry points of udp can obtain the conn_t via the * Q_TO_CONN() macro, using the queue value obtained from UDP_WR(). * 3. While in /dev/ip context, putnext() to the module above udp can * be done by supplying the queue value obtained from UDP_RD(). * Bind hash list size and hash function. It has to be a power of 2 for /* UDP bind fanout hash structure. */ /* udp_fanout_t *udp_bind_fanout. */ * This controls the rate at which some ndd info report functions can be used * by non-privileged users. It stores the last time such info was * requested. When those report functions are called again, this * is checked against the current time and compared with the ndd param * udp_ndd_get_info_interval. "ndd get info rate too high for non-privileged users, try again " \
/* Option processing attrs */ /* Support for just SNMP if UDP is not pushed directly over device IP */ /* Hint not protected by any lock */ * Extra privileged ports. In host byte order. /* Only modified during _init and _fini thus no locking is needed. */ static IDP udp_g_nd;
/* Points to table of UDP ND variables. */ /* MIB-2 stuff for SNMP */ /* Default structure copied into T_INFO_ACK messages */ T_INVALID,
/* ETSU_size. udp does not support expedited data. */ T_INVALID,
/* CDATA_size. udp does not support connect data. */ T_INVALID,
/* DDATA_size. udp does not support disconnect data. */ sizeof (
sin_t),
/* ADDR_size. */ 0,
/* OPT_size - not initialized here */ T_CLTS,
/* SERV_type. udp supports connection-less. */ TS_UNBND,
/* CURRENT_state. This is set from udp_state. */ T_INVALID,
/* ETSU_size. udp does not support expedited data. */ T_INVALID,
/* CDATA_size. udp does not support connect data. */ T_INVALID,
/* DDATA_size. udp does not support disconnect data. */ sizeof (
sin6_t),
/* ADDR_size. */ 0,
/* OPT_size - not initialized here */ T_CLTS,
/* SERV_type. udp supports connection-less. */ TS_UNBND,
/* CURRENT_state. This is set from udp_state. */ /* largest UDP port number */ * Table of ND variables supported by udp. These are loaded into udp_g_nd * All of these are alterable, within the min/max values given, at run time. { 0L,
256,
32,
"udp_wroff_extra" },
{
1L,
255,
255,
"udp_ipv4_ttl" },
{
1024, (
32 *
1024),
1024,
"udp_smallest_nonpriv_port" },
{ 0,
1,
1,
"udp_do_checksum" },
{
1024,
UDP_MAX_PORT, (
32 *
1024),
"udp_smallest_anon_port" },
{
65536, (
1<<
30),
2*
1024*
1024,
"udp_max_buf"},
{
100,
60000,
1000,
"udp_ndd_get_info_interval"},
* The smallest anonymous port in the privileged port range which UDP * looks for free port. Use in the option UDP_ANONPRIVBIND. /* If set to 0, pick ephemeral port sequentially; otherwise randomly. */ * Hook functions to enable cluster networking. * On non-clustered systems these vectors must always be NULL * Notes on UDP endpoint synchronization: * UDP needs exclusive operation on a per endpoint basis, when executing * functions that modify the endpoint state. udp_rput_other() deals with * packets with IP options, and processing these packets end up having * to update the endpoint's option related state. udp_wput_other() deals * with control operations from the top, e.g. connect() that needs to * update the endpoint state. These could be synchronized using locks, * but the current version uses squeues for this purpose. squeues may * give performance improvement for certain cases such as connected UDP * sockets; thus the framework allows for using squeues. * The perimeter routines are described as follows: * Enter the UDP endpoint perimeter. * Become exclusive on the UDP endpoint. Specifies a function * that will be called exclusively either immediately or later * when the perimeter is available exclusively. * Exit the UDP perimeter. * Entering UDP from the top or from the bottom must be done using * udp_enter(). No lock must be held while attempting to enter the UDP * perimeter. When finished, udp_exit() must be called to get out of * UDP operates in either MT_HOT mode or in SQUEUE mode. In MT_HOT mode, * multiple threads may enter a UDP endpoint concurrently. This is used * for sending and/or receiving normal data. Control operations and other * special cases call udp_become_writer() to become exclusive on a per * endpoint basis and this results in transitioning to SQUEUE mode. squeue * by definition serializes access to the conn_t. When there are no more * pending messages on the squeue for the UDP connection, the endpoint * reverts to MT_HOT mode. 
During the interregnum when not all MT threads * of an endpoint have finished, messages are queued in the UDP endpoint * and the UDP is in UDP_MT_QUEUED mode or UDP_QUEUED_SQUEUE mode. * These modes have the following analogs: * UDP_MT_QUEUED RW_WRITE_WANTED * UDP_SQUEUE or UDP_QUEUED_SQUEUE RW_WRITE_LOCKED * Stable modes: UDP_MT_HOT, UDP_SQUEUE * Transient modes: UDP_MT_QUEUED, UDP_QUEUED_SQUEUE * While in stable modes, UDP keeps track of the number of threads * operating on the endpoint. The udp_reader_count variable represents * the number of threads entering the endpoint as readers while it is * in UDP_MT_HOT mode. Transitioning to UDP_SQUEUE happens when there * is only a single reader, i.e. when this counter drops to 1. Likewise, * udp_squeue_count represents the number of threads operating on the * endpoint's squeue while it is in UDP_SQUEUE mode. The mode transition * to UDP_MT_HOT happens after the last thread exits the endpoint, i.e. * when this counter drops to 0. * The default mode is set to UDP_MT_HOT and UDP alternates between * UDP_MT_HOT and UDP_SQUEUE as shown in the state transition below. 
 * ----------------------------------------------------------------
 * old mode             Event                               New mode
 * ----------------------------------------------------------------
 * UDP_MT_HOT           Call to udp_become_writer()         UDP_SQUEUE
 *                      and udp_reader_count == 1
 * UDP_MT_HOT           Call to udp_become_writer()         UDP_MT_QUEUED
 *                      and udp_reader_count > 1
 * UDP_MT_QUEUED        udp_reader_count drops to zero      UDP_QUEUED_SQUEUE
 * UDP_QUEUED_SQUEUE    All messages enqueued on the        UDP_SQUEUE
 *                      internal UDP queue successfully
 *                      moved to squeue AND udp_squeue_count != 0
 * UDP_QUEUED_SQUEUE    All messages enqueued on the        UDP_MT_HOT
 *                      internal UDP queue successfully
 *                      moved to squeue AND udp_squeue_count
 * UDP_SQUEUE           udp_squeue_count drops to zero      UDP_MT_HOT
 * ----------------------------------------------------------------
/* Context of udp_mode_assertions */ * Messages have not yet been enqueued on the internal queue, * otherwise we would have switched to UDP_MT_QUEUED. Likewise * by definition, there can't be any messages enqueued on the * squeue. The UDP could be quiescent, so udp_reader_count * could be zero at entry. * The last MT thread to exit the udp perimeter empties the * internal queue and then switches the UDP to * UDP_QUEUED_SQUEUE mode. Since we are still in UDP_MT_QUEUED * mode, it means there must be at least 1 MT thread still in * the perimeter and at least 1 message on the internal queue. * The switch has happened from MT to SQUEUE. So there can't * be any MT threads. Messages could still pile up on the internal * queue until the transition is complete and we move to * UDP_SQUEUE mode. We can't assert on nonzero udp_squeue_count * since the squeue could drain any time. * The transition is complete. There can't be any messages on * the internal queue. The udp could be quiescent or the squeue * could drain any time, so we can't assert on nonzero * udp_squeue_count during entry.
Nor can we assert that * udp_reader_count is zero, since, a reader thread could have * directly become writer in line by calling udp_become_writer * without going through the queued states. /* We can execute as reader right away. */ \
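The mode transitions described in the comments above can be sketched as a pure function over (mode, event, counters). This is an illustrative model only, not the locking code in udp.c: `sketch_mode_t`, `sketch_event_t` and `sketch_next_mode()` are hypothetical names, the events are a simplification of the table's "Event" column, and the UDP_QUEUED_SQUEUE to UDP_MT_HOT rule is assumed to apply when udp_squeue_count is zero, since the comparison is cut off in the extracted text.

```c
#include <assert.h>

/* Mode names are taken from the comment; the enum itself is hypothetical. */
typedef enum {
	UDP_MT_HOT,
	UDP_MT_QUEUED,
	UDP_QUEUED_SQUEUE,
	UDP_SQUEUE
} sketch_mode_t;

/* Hypothetical event labels for the "Event" column of the table. */
typedef enum {
	EV_BECOME_WRITER,	/* udp_become_writer() was called */
	EV_READER_EXIT,		/* an MT thread left the perimeter */
	EV_QUEUE_MOVED,		/* internal queue fully moved to the squeue */
	EV_SQUEUE_EXIT		/* a thread finished on the squeue */
} sketch_event_t;

/*
 * Return the mode that follows 'mode' after 'ev'. The counters are the
 * values observed after the event has been accounted for.
 */
static sketch_mode_t
sketch_next_mode(sketch_mode_t mode, sketch_event_t ev,
    unsigned reader_count, unsigned squeue_count)
{
	switch (mode) {
	case UDP_MT_HOT:
		if (ev == EV_BECOME_WRITER) {
			/* The caller is itself counted as a reader. */
			assert(reader_count >= 1);
			return (reader_count == 1 ? UDP_SQUEUE : UDP_MT_QUEUED);
		}
		break;
	case UDP_MT_QUEUED:
		if (ev == EV_READER_EXIT && reader_count == 0)
			return (UDP_QUEUED_SQUEUE);
		break;
	case UDP_QUEUED_SQUEUE:
		if (ev == EV_QUEUE_MOVED) {
			/* Assumption: a zero count means the squeue drained. */
			return (squeue_count != 0 ? UDP_SQUEUE : UDP_MT_HOT);
		}
		break;
	case UDP_SQUEUE:
		if (ev == EV_SQUEUE_EXIT && squeue_count == 0)
			return (UDP_MT_HOT);
		break;
	}
	return (mode);	/* no transition for this event */
}
```

Keeping the transition logic free of side effects makes each row of the table checkable in isolation; the real code additionally has to hold the endpoint lock while it reads and updates the counters.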
* We are in squeue mode, send the \ * Some messages may have been enqueued \ * ahead of us. Enqueue the new message \ * at the tail of the internal queue to \ * preserve message ordering. \ * We are the only MT thread. Switch to squeue mode /* Enqueue the packet internally in UDP */ * We are already exclusive. i.e. we are already * writer. Simply call the desired function. * Transition from MT mode to SQUEUE mode, when the last MT thread * is exiting the UDP perimeter. Move all messages from the internal * udp queue to the squeue. A better way would be to move all the * messages in one shot, this needs more support from the squeue framework * It is best not to hold any locks across the calls * to squeue functions. Since we drop the lock we * need to go back and check the udp_mphead once again * after the squeue_fill and hence the while loop at * the top of this function * udp_squeue_count of zero implies that the squeue has drained * even before we arrived here (i.e. after the squeue_fill above) * If this is the last MT thread, we need to \ * switch to squeue mode \ * Even if the udp_squeue_count drops to zero, we \ * don't want to change udp_mode to UDP_MT_HOT here. \ * The thread in udp_switch_to_squeue will take care \ * of the transition to UDP_MT_HOT, after emptying \ * any more new messages that have been enqueued in \ * Return the next anonymous port in the privileged port range for * Trusted Extension (TX) notes: TX allows administrator to mark or * reserve ports as Multilevel ports (MLP). MLP has special function * on TX systems. Once a port is made MLP, it's not available as * ordinary port. This creates "holes" in the port name space. It * may be necessary to skip the "holes" find a suitable anon port. /* UDP bind hash report triggered via the Named Dispatch mechanism. */ /* Refer to comments in udp_status_report(). */ /* The following may work even if we cannot get a large buf. */ " zone lport src addr dest addr port state");
/* 1234 12345 xxx.xxx.xxx.xxx xxx.xxx.xxx.xxx 12345 UNBOUND */ /* Print the hash index. */ /* skip to first entry in this zone; might be none */ * Hash list removal routine for udp_t structures. * Extract the lock pointer in case there are concurrent * hash_remove's for this instance. * If the new udp is bound to the INADDR_ANY address * and the first one in the list is not bound to * INADDR_ANY we skip all entries until we find the * first one bound to INADDR_ANY. * This makes sure that applications binding to a * specific address get preference over those binding to * It associates a port number and local address with the stream. * protocol type (IPPROTO_UDP) placed in the message following the address. * A T_BIND_ACK message is passed upstream when ip acknowledges the request. * Note that UDP over IPv4 and IPv6 sockets can use the same port number * without setting SO_REUSEADDR. This is needed so that they * can be viewed as two independent transport protocols. * However, anonymous ports are allocated from the same range to avoid * duplicating the udp_g_next_port_to_try. "udp_bind: bad req, len %u",
* Reallocate the message to make sure we have enough room for an * address and the protocol type. case 0:
/* Request for a generic port */ case sizeof (
sin_t):
/* Complete IPv4 address */ case sizeof (
sin6_t):
/* complete IPv6 address */ default:
/* Invalid request */ else /* T_BIND_REQ and requested_port != 0 */ * If the application passed in zero for the port number, it * doesn't care which port number we bind to. Get one in the * If the port is in the well-known privileged range, * make sure the caller was privileged. * Copy the source address into our udp structure. This address * may still be zero; if so, IP will fill in the correct address * each time an outbound packet is passed to it. * If udp_reuseaddr is not set, then we have to make sure that * the IP address and port number the application requested * (or we selected for the application) are not being used by * another stream. If another stream is already using the * requested IP address and port, the behavior depends on * "bind_to_req_port_only". If set, the bind fails; otherwise we * search for any unused port to bind to the stream. * As per the BSD semantics, as modified by the Deering multicast * changes, if udp_reuseaddr is set, then we allow multiple binds * to the same port independent of the local IP address. * This is slightly different than in SunOS 4.X which did not * support IP multicast. Note that the change implemented by the * Deering multicast code affects all binds - not only binding * to IP multicast addresses. * Note that when binding to port zero we ignore SO_REUSEADDR in * order to guarantee a unique port. /* loopmax = (IPPORT_RESERVED-1) - udp_min_anonpriv_port + 1 */ * Walk through the list of udp streams bound to * requested port with the same IP address. * On a labeled system, we must treat bindings to ports * on shared IP addresses by sockets with MAC exemption * privilege as being in all zones, as there's * otherwise no way to identify the right receiver. * If UDP_EXCLBIND is set for either the bound or * binding endpoint, the semantics of bind * are changed according to the following chart. 
* spec = specified address (v4 or v6) * unspec = unspecified address (v4 or v6) * A = specified addresses are different for endpoints * ------------------------------------- * For labeled systems, SO_MAC_EXEMPT behaves the same * as UDP_EXCLBIND, except that zoneid is ignored. * Check ipversion to allow IPv4 and IPv6 sockets to * have disjoint port number spaces. * On the first time through the loop, if the * user intentionally specified a * particular port number, then ignore any * bindings of the other protocol that may * conflict. This allows the user to bind IPv6 * alone and get both v4 and v6, or bind * both and get each separately. On subsequent * times through the loop, we're checking a * port that we chose (not the user) and thus * we do not allow casual duplicate bindings. * No difference depending on SO_REUSEADDR. * If existing port is bound to a * non-wildcard IP address and * the requesting stream is bound to * a distinct IP address * (non-wildcard, also), keep going. * No other stream has this IP address * and port number. We can use it. * We get here only when requested port * is bound (and only first of the for() * The semantics of this bind request * require it to fail so we return from * the routine (and exit the loop). * If the application wants us to find * a port, get one to start with. Set * requested_port to 0, so that we will * update udp_g_next_port_to_try below. * We've tried every possible port number and * there are none available, so send an error * Copy the source address into our udp structure. This address * may still be zero; if so, ip will fill in the correct address * each time an outbound packet is passed to it. * If we are binding to a broadcast or multicast address udp_rput * will clear the source address when it receives the T_BIND_ACK. * Now reset the next anonymous port if the application requested * an anonymous port, or we handed out the next anonymous port. 
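The comments above describe walking candidate ports until a free one is found, giving up once every port has been tried, and then resetting the next anonymous port. A minimal sketch of that search, under stated assumptions: `sketch_pick_anon_port()`, `sketch_used_ports_t` and `sketch_port_in_set()` are hypothetical helpers, the conflict test is reduced to set membership (the real code also compares addresses, zones and IP versions), and a return value of 0 stands in for the error case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate type: true if 'port' is already bound. */
typedef bool (*sketch_in_use_fn)(uint16_t port, void *arg);

/* Hypothetical set of bound ports, used by the sample predicate below. */
typedef struct {
	const uint16_t *ports;
	size_t nports;
} sketch_used_ports_t;

static bool
sketch_port_in_set(uint16_t port, void *arg)
{
	const sketch_used_ports_t *set = arg;
	size_t i;

	for (i = 0; i < set->nports; i++) {
		if (set->ports[i] == port)
			return (true);
	}
	return (false);
}

/*
 * Search [lo, hi] for a free port, starting at *next_to_try and wrapping
 * around at most once. On success, advance *next_to_try past the chosen
 * port and return it; return 0 if every port in the range is in use.
 */
static uint16_t
sketch_pick_anon_port(uint16_t lo, uint16_t hi, uint16_t *next_to_try,
    sketch_in_use_fn in_use, void *arg)
{
	uint32_t span, i, port;

	assert(lo != 0 && lo <= hi);
	span = (uint32_t)hi - lo + 1;
	port = *next_to_try;
	if (port < lo || port > hi)
		port = lo;

	for (i = 0; i < span; i++) {
		if (!in_use((uint16_t)port, arg)) {
			*next_to_try = (port == hi) ? lo : (uint16_t)(port + 1);
			return ((uint16_t)port);
		}
		port = (port == hi) ? lo : port + 1;
	}
	return (0);	/* tried every port in the range */
}
```

Bounding the loop by the size of the range, rather than by a sentinel port value, is what guarantees termination when the whole range is occupied.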
/* Rebuild the header template */ * Running in cluster mode - register bind information "udp_bind: no priv for multilevel port %d",
* If we're specifically binding a shared IP address and the * port is MLP on shared addresses, then check to see if this * zone actually owns the MLP. Reject if not. "udp_bind: attempt to bind port " "%d on shared addr in zone %d " "udp_bind: cannot establish anon " "MLP for port %d",
port);
/* Pass the protocol number in the message following the address. */ * Append a request for an IRE if udp_v6src is not * zero (IPv4 - INADDR_ANY, or IPv6 - all-zeroes address). * This is called from ip_wput_nondata to resume a deferred UDP bind. * This routine handles each T_CONN_REQ message passed to udp. It * associates a default destination address with the stream. * This routine sends down a T_BIND_REQ to IP with the following mblks: * T_BIND_REQ - specifying local and remote address/port * IRE_DB_REQ_TYPE - to get an IRE back containing ire_type and src * T_OK_ACK - for the T_CONN_REQ * T_CONN_CON - to keep the TPI user happy * The connect completes in udp_rput. * When a T_BIND_ACK is received, information is extracted from the IRE * and the two appended messages are sent to the TPI user. * Should udp_rput receive T_ERROR_ACK for the T_BIND_REQ it will convert * it to an error ack for the appropriate primitive. /* A bit of sanity checking */ * This UDP must have bound to a port already before doing /* Already connected - clear out state */ * Determine packet type based on type of address passed in * the request should contain an IPv4 or IPv6 address. * Make sure that address family matches the type of * family of the address passed down * Create a default IP header with no IP options. * Interpret a zero destination to mean loopback. * Update the T_CONN_REQ (sin/sin6) since it is used to * generate the T_CONN_CON. * If the destination address is multicast and * an outgoing multicast interface has been set, * use the address of that interface as our * source address if no source address has been set. * Interpret a zero destination to mean loopback. * Update the T_CONN_REQ (sin/sin6) since it is used to * generate the T_CONN_CON. * If the destination address is multicast and * an outgoing multicast interface has been set, * then the ip bind logic will pick the correct source * address (i.e. matching the outgoing multicast interface). 
* connections in TS_DATA_XFER * Send down bind to IP to verify that there is a route * and to determine the source address. * This will come back as T_BIND_ACK with an IRE_DB_TYPE in rput. * We also have to send a connection confirmation to * keep TLI happy. Prepare it for udp_rput. /* Unable to reuse the T_CONN_REQ for the ack. */ /* Hang onto the T_OK_ACK and T_CONN_CON for later. */ * Disable read-side synchronous stream * interface and drain any queued data. /* restore IP module's high and low water marks to default values */ * Restore connp as an IP endpoint. * Locking required to prevent a race with udp_snmp_get()/ * ipcl_get_next_conn(), which selects conn_t which are * IPCL_UDP and not CONN_CONDEMNED. * Called in the close path from IP (ip_quiesce_conn) to quiesce the conn * Running in cluster mode - register unbind information /* If there are any options associated with the stream, free them. */ /* Free memory associated with sticky options */ * This routine handles each T_DISCON_REQ message passed to udp * as an indication that UDP is no longer connected. This results * in sending a T_BIND_REQ to IP to restore the binding to just * This routine sends down a T_BIND_REQ to IP with the following mblks: * T_OK_ACK - for the T_DISCON_REQ * The disconnect completes in udp_rput. * When a T_BIND_ACK is received the appended T_OK_ACK is sent to the TPI user. * Should udp_rput receive T_ERROR_ACK for the T_BIND_REQ it will convert * it to an error ack for the appropriate primitive. * Send down bind to IP to remove the full binding and revert * to the local address binding. /* Unable to reuse the T_DISCON_REQ for the ack. */ /* Rebuild the header template */ /* Append the T_OK_ACK to the T_BIND_REQ for udp_rput */ /* This routine creates a T_ERROR_ACK message and passes it upstream. 
*/ /* Shorthand to generate and send TPI error acks to our client */ * Fail the request if the new value does not lie within the /* Check if the value is already in the list */ * Fail the request if the new value does not lie within the /* Check that the value is already in the list */ /* At minimum we need 4 bytes of UDP header */ * udp_icmp_error is called by udp_rput to process ICMP messages passed up by IP. * Generates the appropriate T_UDERROR_IND for permanent (non-transient) errors. * Assumes that IP has pulled up everything up to and including the ICMP header. * An M_CTL could potentially come here from some other module (i.e. if UDP * is pushed on some module other than IP). Thus, if we find that the M_CTL * does not have enough ICMP information, following STREAMS conventions, * we send it upstream assuming it is an M_CTL we don't understand. * Assume IP provides aligned packets - otherwise toss * Verify that we have a complete IP header and the application has * asked for errors. If not, send it upstream. * Verify IP version. Anything other than IPv4 or IPv6 packet is sent * upstream. ICMPv6 is handled in udp_icmp_error_ipv6. /* Skip past the outer IP and ICMP headers */ * If we don't have the correct outer IP header length or if the ULP * is not IPPROTO_ICMP or if we don't have a complete inner IP header * send the packet upstream. /* Skip past the inner IP and find the ULP header */ * If we don't have the correct inner IP header length or if the ULP * is not IPPROTO_UDP or if we don't have at least ICMP_MIN_UDP_HDR * bytes of UDP header, send it upstream. * IP has already adjusted the path MTU. * XXX Somehow pass MTU indication to application? * udp_icmp_error_ipv6 is called by udp_icmp_error to process ICMP for IPv6. * Generates the appropriate T_UDERROR_IND for permanent (non-transient) errors. * Assumes that IP has pulled up all the extension headers as well as the * An M_CTL could potentially come here from some other module (i.e. 
if UDP * is pushed on some module other than IP). Thus, if we find that the M_CTL * does not have enough ICMP information, following STREAMS conventions, * we send it upstream assuming it is an M_CTL we don't understand. The reason * it might get here is if the non-ICMP M_CTL accidentally has 6 in the version * field (when cast to ipha_t in udp_icmp_error). * Verify that we have a complete IP header. If not, send it upstream. * Verify this is an ICMPV6 packet, else send it upstream * Verify we have a complete ICMP and inner IP header. * Validate inner header. If the ULP is not IPPROTO_UDP or if we don't * have at least ICMP_MIN_UDP_HDR bytes of UDP header send the * If the application has requested to receive path mtu * information, send up an empty message containing an * IPV6_PATHMTU ancillary data item. * newmp->b_cont is left to NULL on purpose. This is an * empty message containing only ancillary data. * We've consumed everything we need from the original * message. Free it, then send our empty message. /* If this corresponds to an ICMP_PROTOCOL_UNREACHABLE */ * This routine responds to T_ADDR_REQ messages. It is called by udp_wput. * The local address is filled in if endpoint is bound. The remote address * is filled in if remote address has been specified ("connected endpoint") * (The concept of connected CLTS sockets is alien to published TPI * but we support it anyway). /* Make it large enough for worst case */ * Note: Following code assumes 32 bit alignment of basic * data structures like sin_t and struct T_addr_ack. 
* Fill in local address first /* Fill zeroes and then initialize non-zero fields */ * udp_v6src is not set, we might be bound to * local address instead (that could * also still be INADDR_ANY) * connected, fill remote address too /* assumed 32-bit alignment */ /* Fill zeroes and then initialize non-zero fields */ * udp_v6src is not set, we might be bound to * local address instead (that could * also still be UNSPECIFIED) * connected, fill remote address too /* assumed 32-bit alignment */ * This routine responds to T_CAPABILITY_REQ messages. It is called by * udp_wput. Much of the T_CAPABILITY_ACK information is copied from * udp_g_t_info_ack. The current state of the stream is copied from * This routine responds to T_INFO_REQ messages. It is called by udp_wput. * Most of the T_INFO_ACK information is copied from udp_g_t_info_ack. * The current state of the stream is copied from udp_state. /* Create a T_INFO_ACK message. */ * IP recognizes seven kinds of bind requests: * - A zero-length address binds only to the protocol number. * - A 4-byte address is treated as a request to * validate that the address is a valid local IPv4 * address, appropriate for an application to bind to. * IP does the verification, but does not make any note * of the address at this time. * - A 16-byte address is treated as a request * to validate a local IPv6 address, as the 4-byte * - A 16-byte sockaddr_in to validate the local IPv4 address and also * use it for the inbound fanout of packets. * - A 24-byte sockaddr_in6 to validate the local IPv6 address and also * use it for the inbound fanout of packets. * - A 12-byte address (ipa_conn_t) containing complete IPv4 fanout * information consisting of local and remote addresses * and ports. 
In this case, the addresses are both * validated as appropriate for this operation, and, if * so, the information is retained for use in the * - A 36-byte address (ipa6_conn_t) containing complete IPv6 * fanout information, like the 12-byte case above. * IP will also fill in the IRE request mblk with information * regarding our peer. In all cases, we notify IP of our protocol * type by appending a single protocol byte to the bind request. /* Append a request for an IRE */ /* cp known to be 32 bit aligned */ /* Append a request for an IRE */ /* cp known to be 32 bit aligned */ /* Append a request for an IRE */ /* Append a request for an IRE */ /* Add protocol number to end */ * This is the open routine for udp. It allocates a udp_t structure for * the stream and, on the first open of the module, creates an ND table. /* If the stream is already open, return immediately. */ /* If this is not a push of udp as a module, fail. */ /* Insert ourselves in the stream since we're about to walk q_next */ * UDP is supported only as a module and it has to be pushed directly * above the device instance of IP. If UDP is pushed anywhere else * on a stream, it will support just T_SVR4_OPTMGMT_REQ for the * sake of MIB browsers and fail everything else. /* Support just SNMP for MIB browsers */ * Initialize the udp_t structure for this stream. /* Set the initial state of the stream and the privilege status. */ * If the caller has the process-wide flag set, then default to MAC * exempt mode. This allows read-down to unlabeled hosts. * The transmit hiwat/lowat is only looked at on IP's queue. /* Build initial header template for transmit */ /* Set the Stream head write offset and high watermark. */ * Which UDP options OK to set through T_UNITDATA_REQ... 
* This routine gets default values of certain options whose default * values are maintained by protocol-specific code * This routine retrieves the current status of socket options * and expects the caller to pass in the queue pointer of the * upper instance. It returns the size of the option retrieved. break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ * The following three items are available here, * but are only meaningful to IP. break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ /* Handled at IP level */ /* 0 address if not set */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ /* cannot "get" the value for these */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ /* cannot "get" the value for these */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ break;
/* goto sizeof (int) option return */ /* XXX assumes that caller has room for max size! */ break;
/* goto sizeof (int) option return */ * User is not usually aware of this option. * We copy out the hbh opt after the label option. * This routine sets socket options; it expects the caller * to pass in the queue pointer of the upper instance. * Note: Implies T_CHECK semantics for T_OPTCOM_REQ * inlen != 0 implies value supplied and * we have to "pretend" to set it. * inlen == 0 implies that there is no * value part in T_CHECK request and just validation * done elsewhere should be enough, we just return here. * Negotiating local and "association-related" options * through T_UNITDATA_REQ. * Following routine can filter out ones we do not * want to be "set" this way. * We should never get here * For fixed length options, no sanity check * of passed in length is done. It is assumed *_optcom_req() * routines do the right thing. * The following three items are available here, * but are only meaningful to IP. * "soft" error (negative) * option not handled at this level * Do not modify *outlenp. * Only sockets that have proper privileges and are * bound to MLPs will have any other value here, so * this implicitly tests for privilege to set label. /* Save options for use by IP. */ * TODO should check OPTMGMT reply and undo this if * "soft" error (negative) * option not handled at this level * Do not modify *outlenp. * Deal with both sticky options and ancillary data /* sticky options, or none */ /* -1 means use default */ /* Pass modified value to IP. */ /* Rebuild the header template */ /* -1 means use default */ /* Pass modified value to IP. */ if (*
i1 != 0 && *
i1 !=
1) {
* "soft" error (negative) * option not handled at this level * Note: Do not modify *outlenp * Set boolean switches for ancillary data delivery * Set sticky options or ancillary data. * If sticky options, (re)build any extension headers * that might be needed as a result. * The source address and ifindex are verified * in ip_opt_set(). For ancillary data the * source address is checked in ip_wput_v6. if (*
i1 >
255 || *
i1 < -
1)
if (*
i1 >
255 || *
i1 < -
1)
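The range check above rejects values greater than 255 or less than -1, and an earlier comment notes that -1 means "use default". A minimal sketch of that convention follows, assuming the option is a TTL or hop-limit style byte value; which option each check belongs to is not visible in the extracted text, and `sketch_resolve_hoplimit()` is a hypothetical helper, not a function from udp.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Validate a requested hop limit. -1 selects the caller-supplied
 * default, 0..255 is taken literally, anything else is rejected.
 * Returns 0 on success and -1 on an out-of-range request, in which
 * case *out is left unchanged.
 */
static int
sketch_resolve_hoplimit(int requested, uint8_t dflt, uint8_t *out)
{
	assert(out != NULL);
	if (requested > 255 || requested < -1)
		return (-1);
	*out = (requested == -1) ? dflt : (uint8_t)requested;
	return (0);
}
```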
* IP will verify that the nexthop is reachable * and fail for sticky options. * Sanity checks - minimum size, size a multiple of * eight bytes, and matching size passed in. * Sanity checks - minimum size, size a multiple of * eight bytes, and matching size passed in. * Sanity checks - minimum size, size a multiple of * eight bytes, and matching size passed in. * Sanity checks - minimum size, size a multiple of * eight bytes, and matching size passed in. if (
inlen !=
sizeof (
int))
/* Handled at the IP level */ * Common case of OK return with outval same as inval. /* don't trust bcopy for identical src/dst */ * Update udp_sticky_hdrs based on udp_sticky_ipp, udp_v6src, and udp_ttl. * The headers include ip6i_t (if needed), ip6_t, any sticky extension * headers, and the udp header. * Returns failure if it can't allocate memory. /* Set header fields not in ipp */ /* Try to get everything in a single mblk */ * This routine retrieves the value of an ND variable in a udpparam_t * structure. It is called through nd_getset when a user reads the * Walk through the param array specified registering each element with the * named dispatch (ND) handler. /* This routine sets an ND variable in a udpparam_t structure. */ * Fail the request if the new value does not lie within the * Copy hop-by-hop option from ipp->ipp_hopopts to the buffer provided (with * T_opthdr) and return the number of bytes copied. 'dbuf' may be NULL to * just count the length needed for allocation. If 'dbuf' is non-NULL, * then it's assumed to be allocated to be large enough. * Returns zero if trimming of the security option causes all options to go * If labeling is enabled, then skip the label option * but get other options if there are any. /* will fill in ip6h_len later */ * This loop finds the first (lastpad pointer) of any number of * pads that precede the security option, then treats the * security option as though it were a pad, and then finds the * next non-pad option (or end of list). * It then treats the entire block as one big pad. To preserve * alignment of any options that follow, or just the end of the * list, it computes a minimal new padding size that keeps the * same alignment for the next option. * If it encounters just a sequence of pads with no security * option, those are copied as-is rather than collapsed. * Note that to handle the end of list case, the code makes one * loop with 'hol' set to zero. 
/* if nothing was copied at all, then delete */ /* last pass; pick up any trailing padding */ * compute aligning effect of deleted material /* if there's uncopied padding, then copy that now */ /* go back and patch up the length value, rounded upward */ int udi_size;
/* Size of T_unitdata_ind */ "udp_rput_start: q %p mp %p", q,
mp);
* IP should have prepended the options data in an M_CTL * Check M_CTL "type" to make sure we are not here because of * IP_RECVIF or IP_RECVSLLA information has been * appended to the packet by IP. We need to * extract the mblk and adjust the rptr "udp_rput_end: q %p (%S)", q,
"m_ctl");
* This is the inbound data path. * First, we check to make sure the IP version number is correct, * and then pull the IP and UDP headers into the first mblk. * Assume IP provides aligned packets - otherwise toss. * Also, check if we have a complete IP header. /* Initialize regardless if ipversion is IPv4 or IPv6 */ * Handle IPv4 packets with options outside of the * main data path. Not needed for AF_INET6 sockets * since they don't support a getsockopt of IP_OPTIONS. * UDP length check performed for IPv4 packets with * options to check whether UDP length specified in * the header is the same as the physical length of * Handle the case where the packet has IP options * and the IP_RECVSLLA & IP_RECVIF are set "udp_rput_end: q %p (%S)", q,
"end");
/* Handle IPV6_RECVHOPLIMIT. */ * IPv6 packets can only be received by applications * that are prepared to receive IPv6 addresses. * The IP fanout must ensure this. /* Look for ifindex information */ * Find any potentially interesting extension headers * as well as the length of the IPv6 + extension * IP inspected the UDP header thus all of it must be in the mblk. * UDP length check is performed for IPv6 packets and IPv4 packets * without options to check if the size of the packet as specified * by the header is the same as the physical size of the packet. /* Walk past the headers. */ * This is the inbound data path. Packets are passed upstream as * T_UNITDATA_IND messages with full IP headers still attached. * Normally only send up the address. * If IP_RECVDSTADDR is set we include the destination IP * address as an option. With IP_RECVOPTS we include all * the IP options. Only ip_rput_other() handles packets * that contain IP options. * If the IP_RECVSLLA or the IP_RECVIF is set then allocate * If SO_TIMESTAMP is set allocate the appropriate sized * buffer. Since gethrestime() expects a pointer aligned * argument, we allocate space necessary for extra * alignment (even though it might not be used). * If IP_RECVTTL is set allocate the appropriate sized buffer /* Allocate a message block for the T_UNITDATA_IND structure. */ "udp_rput_end: q %p (%S)", q,
"allocbfail");
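The SO_TIMESTAMP comment above says extra space is reserved so that the timestamp can be placed at a pointer-aligned address before gethrestime() fills it in. A minimal sketch of that idea follows; the helper names `sketch_align_up` and `sketch_reserve` are hypothetical, and the exact slack the real code reserves is not shown in this listing, so the `align - 1` figure here is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: round addr up to the next multiple of align.
 * align must be a non-zero power of two.
 */
static uintptr_t
sketch_align_up(uintptr_t addr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr + (align - 1)) & ~((uintptr_t)align - 1));
}

/*
 * Hypothetical helper: worst-case number of bytes to reserve so that a
 * payload of the given size can always start on an aligned address
 * inside the reserved region. The slack may go unused when the region
 * already starts aligned.
 */
static size_t
sketch_reserve(size_t payload, size_t align)
{
	return (payload + align - 1);
}
```

Reserving the slack up front means the option buffer size can be computed once, before the allocation, without knowing where the timestamp will land.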
* Add options if IP_RECVDSTADDR, IP_RECVIF, IP_RECVSLLA or * IP_RECVTTL has been set. * Copy in destination address before options to avoid /* Align for gethrestime() */ * Processing of IP_RECVTTL option * should always be the last. Adding * any option processing after this will /* Consumed all of allocated space */ * Handle both IPv4 and IPv6 packets for IPv6 sockets. * Normally we only send up the address. If receiving of any * optional receive side information is enabled, we also send * [ Only udp_rput_other() handles packets that contain IP * options so code to account for does not appear immediately "udp_rput_end: q %p (%S)", q,
"allocbfail");
/* No sin6_flowinfo per API */ /* For link-scope source pass up scope id */ /* Consumed all of allocated space */ /* No IP_RECVDSTADDR for IPv6. */ "udp_rput_end: q %p (%S)", q,
"end");
* There is nothing above us except for the stream head; * use the read-side synchronous stream interface in * order to reduce the time spent in interrupt thread. * Use regular STREAMS interface to pass data upstream * if this is not a socket endpoint, or if we have * switched over to the slow mode due to sockmod being * popped or a module being pushed on top of us. * Process non-M_DATA messages as well as M_DATA messages that requires * modifications to udp_ip_rcv_options i.e. IPv4 packets with IP options. int udi_size;
/* Size of T_unitdata_ind */ int opt_len;
/* Length of IP options */ "udp_rput_other: q %p mp %p", q,
mp);
* We are here only if IP_RECVSLLA and/or IP_RECVIF are set * The actual data is in mp->b_cont * M_DATA messages contain IPv4 datagrams. They are handled /* M_PROTO messages contain some type of TPI message. */ "udp_rput_other_end: q %p (%S)", q,
"protoshort");
* clear out the associated port and source * address before passing the message * upstream. If this was caused by a T_CONN_REQ * revert back to bound state. /* Revert back to the bound source */ * This is the inbound data path. * First, we make sure the data contains both IP and UDP headers. * This handles IPv4 packets only for AF_INET sockets. * AF_INET6 sockets can never access udp_ip_rcv_options thus there * is no need to save the options. "udp_rput_other_end: q %p (%S)", q,
"hdrshort");
/* Walk past the headers. */ /* Save the options if any */ /* Adjust length if we are reusing the space */ * Normally only send up the address. * If IP_RECVDSTADDR is set we include the destination IP * address as an option. With IP_RECVOPTS we include all * If the IP_RECVSLLA or the IP_RECVIF is set then allocate * If IP_RECVTTL is set allocate the appropriate sized buffer /* Allocate a message block for the T_UNITDATA_IND structure. */ "udp_rput_other_end: q %p (%S)", q,
"allocbfail");
* Add options if IP_RECVDSTADDR, IP_RECVIF, IP_RECVSLLA or * IP_RECVTTL has been set. * Copy in destination address before options to avoid any "udp_rput_other_end: q %p (%S)", q,
"end");
* There is nothing above us except for the stream head; * use the read-side synchronous stream interface in * order to reduce the time spent in interrupt thread. * Use regular STREAMS interface to pass data upstream * if this is not a socket endpoint, or if we have * switched over to the slow mode due to sockmod being * popped or a module being pushed on top of us. * the source address to 0. * This ensures no datagrams with broadcast address * as source address are emitted (which would violate * RFC1122 - Hosts requirements) * Note that when connecting the returned IRE is * for the destination address and we only perform * the broadcast check for the source address (it * Note: we get IRE_BROADCAST for IPv6 to "mark" a multicast /* This was just a local bind to a broadcast addr */ * Local address not yet set - pick it from the * Look for one or more appended ACK message added by * udp_connect or udp_disconnect. * If none found just send up the T_BIND_ACK. * udp_connect has appended a T_OK_ACK and a T_CONN_CON. * udp_disconnect has appended a T_OK_ACK. * return SNMP stuff in buffer in mpdata /* fixed length structure for IPv4 and IPv6 counters */ * Note that the port numbers are sent in * Create an IPv4 table entry for IPv4 entries and also * any IPv6 entries which are bound to in6addr_any * If in6addr_any this will set it to * Can potentially get here for * v6 socket if another process * (say, ping) has just done a * sendto(), changing the state * from the TS_IDLE above to * TS_DATA_XFER by the time we hit /* table of MLP attributes... */ /* table of MLP attributes... */ * Return 0 if invalid set request, 1 otherwise, including non-udp requests. * NOTE: Per MIB-II, UDP has no writable data. * TODO: If this ever actually tries to set anything, it needs to be * to do the appropriate locking. /* Report for ndd "udp_status" */ * Because of the ndd constraint, at most we can have 64K buffer * to put in all UDP info. 
So to be more efficient, just * allocate a 64K buffer here, assuming we need that large buffer. * This may be a problem as any user can read udp_status. Therefore * we limit the rate of doing this using udp_ndd_get_info_interval. * This should be OK as normal users should not do this too often. /* The following may work even if we cannot get a large buf. */ " zone lport src addr dest addr port state");
/* 1234 12345 xxx.xxx.xxx.xxx xxx.xxx.xxx.xxx 12345 UNBOUND */ * This routine creates a T_UDERROR_IND message and passes it upstream. * The address and options are copied from the T_UNITDATA_REQ message * passed in mp. This message is freed. * This routine removes a port number association from a stream. It * is called by udp_wput to handle T_UNBIND_REQ messages. /* If a bind has not been done, we can't unbind. */ * Running in cluster mode - register unbind information /* Rebuild the header template */ * Pass the unbind to IP; T_UNBIND_REQ is larger than T_OK_ACK * and therefore ip_unbind must never return NULL. * Don't let port fall into the privileged range. * Since the extra privileged ports can be arbitrary we also * ensure that we exclude those from consideration. * udp_g_epriv_ports is not sorted thus we loop over it until * Unless changed by a sys admin, the smallest anon port * is 32768 and the largest anon port is 65535. It is * very likely (50%) for the random port to be smaller * than the smallest anon port. When that happens, * add port % (anon port range) to the smallest anon * port to get the random port. It should fall into the * Make sure that the port is in the char *,
"queue(1) failed to update options(2) on mp(3)",
* If options passed in, feed it for verification and handling * Note: success in processing options. * mp option buffer represented by * and contain option setting results /* mp1 points to the M_DATA mblk carrying the packet */ * Check if our saved options are valid; update if not * TSOL Note: Since we are not in WRITER mode, UDP packets * to different destination may require different labels. * We use conn_lock to ensure that lastdst, ip_snd_options, * and ip_snd_options_len are consistent for the current * destination and are updated atomically. /* Using UDP MLP requires SCM_UCRED from user */ char *,
"MLP mp(1) lacks SCM_UCRED attr(2) on q(3)",
"udp_wput_end: q %p (%S)", q,
"allocbfail2");
/* Set version, header length, and tos */ /* Set ttl and protocol */ /* Set version, header length, and tos */ /* Set ttl and protocol */ * Copy our address into the packet. If this is zero, * first look at __sin6_src_id for a hint. If we leave the source * as INADDR_ANY then ip will fill in the real source address. /* Determine length of packet */ * If the size of the packet is greater than the maximum allowed by * ip, return an error. Passing this down could cause panics because * the size will have wrapped and be inconsistent with the msg size. "udp_wput_end: q %p (%S)", q,
"IP length exceeded");
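The length check described above guards against a packet size that no longer fits the 16-bit IPv4 total-length field and would wrap. A minimal sketch of such a check is given here; `sketch_ip_total_len` and `SKETCH_IP_MAXPACKET` are hypothetical names chosen for illustration, not identifiers from udp.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest value the 16-bit IPv4 total-length field can hold. */
#define	SKETCH_IP_MAXPACKET	65535u

/*
 * Hypothetical helper: compute ip_hdr_len + udp_hdr_len + payload_len.
 * Returns 0 and stores the sum in *total when it fits in 16 bits,
 * or returns -1 (leaving *total untouched) when it would wrap.
 */
static int
sketch_ip_total_len(size_t ip_hdr_len, size_t udp_hdr_len,
    size_t payload_len, uint16_t *total)
{
	size_t hdrs = ip_hdr_len + udp_hdr_len;

	if (hdrs > SKETCH_IP_MAXPACKET ||
	    payload_len > SKETCH_IP_MAXPACKET - hdrs)
		return (-1);
	*total = (uint16_t)(hdrs + payload_len);
	return (0);
}
```

The comparison is written as a subtraction on the limit rather than an addition on the inputs, so the check itself cannot overflow.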
* Copy in the destination address * Set ttl based on IP_MULTICAST_TTL to match IPv6 logic. * Massage source route putting first source route in ipha_dst. * Ignore the destination in T_unitdata_req. * Create a checksum adjustment for a source route, if any. * IP does the checksum if uha_checksum is non-zero, * We make it easy for IP to include our pseudo header * by putting our length in uha_checksum. /* There might be a carry. */ * IP does the checksum if uha_checksum is non-zero, * We make it easy for IP to include our pseudo header * by putting our length in uha_checksum. /* Set UDP length and checksum */ /* mp has been consumed and we'll return success */ /* We're done. Pass the packet to ip. */ "udp_wput_end: q %p (%S)", q,
"end");
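The "There might be a carry" comment above refers to one's-complement arithmetic: when a source-route adjustment or the UDP length is added into a 16-bit checksum field, any carry out of bit 15 has to be folded back in. A minimal sketch of that folding follows; the function names are hypothetical and the code is not taken from udp.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: fold a 32-bit accumulator into 16 bits using
 * end-around carry, as one's-complement addition requires.
 */
static uint16_t
sketch_fold_carry(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)sum);
}

/* Hypothetical helper: one's-complement sum of two 16-bit words. */
static uint16_t
sketch_ocsum_add(uint16_t a, uint16_t b)
{
	return (sketch_fold_carry((uint32_t)a + (uint32_t)b));
}
```

Accumulating in 32 bits and folding at the end keeps the addition itself simple; the loop handles the case where folding produces a second carry.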
/* Release the old ire */ * We can continue to use the ire but since it was not * cached, we should drop the extra reference. * Check if we can take the fast-path. * Note that "incomplete" ire's (where the link-layer for next hop * is not resolved, or where the fast-path header in nce_fp_mp is not * available yet) are sent down the legacy (slow) path * If the service thread is already running, or if the driver * queue is currently flow-controlled, queue this packet. /* pseudo-header checksum (do it in parts for IP header checksum) */ /* Calculate IP header checksum if hardware isn't capable */ /* If multicast TTL is 0 then we are done */ * Send the packet directly to DLD, where it may be queued * depending on the availability of transmit resources at char *,
"queue(1) failed to update options(2) on mp(3)",
* This routine handles all messages passed downstream. It either * consumes the message or passes it downstream; it never queues a "udp_wput_start: connp %p mp %p",
connp,
mp);
* We directly handle several cases here: T_UNITDATA_REQ message * connected and non-connected socket. The latter carries the * address structure along when this routine gets called. /* Not connected; address is required */ "udp_wput_end: connp %p (%S)",
connp,
"not-connected; address required");
/* Not connected; do some more checks below */ /* M_DATA for connected socket */ /* Initialize addr and addrlen as if they're passed in */ * Handle both AF_INET and AF_INET6; the latter * for IPV4 mapped destination addresses. Note * here that both addr and addrlen point to the * corresponding struct depending on the address /* Handle valid T_UNITDATA_REQ here */ "udp_wput_end: q %p (%S)", q,
"badaddr");
"udp_wput_end: q %p (%S)", q,
"badaddr");
* If a port has not been bound to the stream, fail. * This is not a problem when sockfs is directly * above us, because it will ensure that the socket * is first bound before allowing data to be sent. "udp_wput_end: q %p (%S)", q,
"outstate");
"udp_wput_end: q %p (%S)", q,
"badaddr");
* Destination is a non-IPv4-compatible IPv6 address. * Send out an IPv6 format packet. "udp_wput_end: q %p (%S)", q,
"udp_output_v6");
* If the local address is not zero or a mapped address * return an error. It would be possible to send an IPv4 * packet but the response would never make it back to the * application since it is bound to a non-mapped address. "udp_wput_end: q %p (%S)", q,
"badaddr");
/* Send IPv4 packet without modifying udp_ipversion */ /* Extract port and ipaddr */ "udp_wput_end: q %p (%S)", q,
"badaddr");
/* Extract port and ipaddr */ /* mp is freed by the following routine */ * Allocate and prepare a T_UNITDATA_REQ message. * Entry point for sockfs when udp is in "direct sockfs" mode. This mode * is valid when we are directly beneath the stream head, and thus sockfs * is able to bypass STREAMS and directly call us, passing along the sockaddr * structure without the cumbersome T_UNITDATA_REQ interface. Note that * this is done for both connected and non-connected endpoint. /* udpsockfs should only send down M_DATA for this entry point */ * We can't enter this conn right away because another * thread is currently executing as writer; therefore we * need to deposit the message into the squeue to be * drained later. If a socket address is present, we * need to create a T_UNITDATA_REQ message as placeholder. /* Tag the packet with T_UNITDATA_REQ */ /* We can execute as reader right away. */ * Assumes that udp_wput did some sanity checking on the destination * If the local address is a mapped address return * It would be possible to send an IPv6 packet but the * response would never make it back to the application * since it is bound to a mapped address. * If TPI options passed in, feed it for verification and handling /* mp1 points to the M_DATA mblk carrying the packet */ * IPPF_SCOPE_ID is special. It's neither a sticky * option nor ancillary data. It needs to be * explicitly set in options_exists. * Compute the destination address * If we're not going to the same destination as last time, then * recompute the label required. This is done in a separate routine to * avoid blowing up our stack here. * TSOL Note: Since we are not in WRITER mode, UDP packets * to different destination may require different labels. * We use conn_lock to ensure that lastdst, sticky ipp_hopopts, * and sticky ipp_hopoptslen are consistent for the current * destination and are updated atomically. /* Using UDP MLP requires SCM_UCRED from user */ char *,
"MLP mp(1) lacks SCM_UCRED attr(2) on q(3)",
* If there's a security label here, then we ignore any options the * user may try to set. We keep the peer's label as a hidden sticky * option. We make a private copy of this label before releasing the * lock so that label is kept consistent with the destination addr. /* No sticky options nor ancillary data. */ * Go through the options figuring out where each is going to * come from and build two masks. The first mask indicates if * the option exists at all. The second mask indicates if the * option is sticky or ancillary. /* IPV6_HOPLIMIT can never be sticky */ * If any options carried in the ip6i_t were specified, we * need to account for the ip6i_t in the data we'll be sending /* check/fix buffer config, setup pointers into it */ /* Try to get everything in a single mblk next time */ /* sin6_scope_id takes precedence over IPPF_IFINDEX */ * Enable per-packet source address verification if * IPV6_PKTINFO specified the source address. * ip6_src is set in the transport's _wput function. * tell IP this is an ip6i_t private header /* Initialize IPv6 header */ /* Set the hoplimit of the outgoing packet. */ /* IPV6_HOPLIMIT ancillary data overrides all other settings. */ * The source address was not set using IPV6_PKTINFO. * First look at the bound source. * If unspecified fallback to __sin6_src_id. * Here's where we have to start stringing together * any extension headers in the right order: * Hop-by-hop, destination, routing, and final destination opts. * En-route destination options * Only do them if there's a routing header as well * Do ultimate destination options * Now set the last header pointer to the proto passed in * Copy in the destination address * Perform any processing needed for source routing. * We know that all extension headers will be in the same mblk * Drop packet - only support Type 0 routing. * Notify the application as well. * rth->ip6r_len is twice the number of * addresses in the header. Thus it must be even. 
* Shuffle the routing header and ip6_dst * addresses, and get the checksum difference * between the first hop (in ip6_dst) and * the destination (in the last routing hdr entry). * Verify that the first hop isn't a mapped address. * Routers along the path need to do this verification /* count up length of UDP packet */ * If the size of the packet is greater than the maximum allowed by * ip, return an error. Passing this down could cause panics because * the size will have wrapped and be inconsistent with the msg size. /* Store the UDP length. Subtract length of extension hdrs */ * We make it easy for IP to include our pseudo header * by putting our length in uh_checksum, modified (if * we have a routing header) by the checksum difference * between the ultimate destination and first hop addresses. * Note: UDP over IPv6 must always checksum the packet. /* mp has been consumed and we'll return success */ /* We're done. Pass the packet to IP */ "udp_wput_other_start: q %p", q);
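The routing-header comments above state two validity rules: only Type 0 routing headers are supported, and the length field must be even because it is twice the number of addresses (the field counts 8-octet units, and each IPv6 address occupies two of them). A minimal sketch of those two checks follows; `sketch_rthdr0_naddrs` is a hypothetical helper, not a function from udp.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the routing type and the header length
 * field (in 8-octet units, excluding the first 8 octets), return the
 * number of addresses carried, or -1 if the header must be rejected.
 */
static int
sketch_rthdr0_naddrs(uint8_t type, uint8_t len)
{
	if (type != 0)
		return (-1);	/* only Type 0 routing is supported */
	if (len & 1)
		return (-1);	/* length must be even: two units per address */
	return (len / 2);
}
```

A caller would drop the packet and notify the application when the helper returns -1, matching the behaviour the comments describe.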
"udp_wput_other_end: q %p (%S)",
"udp_wput_other_end: q %p (%S)", q,
"addrreq");
"udp_wput_other_end: q %p (%S)", q,
"bindreq");
"udp_wput_other_end: q %p (%S)", q,
"connreq");
"udp_wput_other_end: q %p (%S)", q,
"capabreq");
"udp_wput_other_end: q %p (%S)", q,
"inforeq");
* If a T_UNITDATA_REQ gets here, the address must * be bad. Valid T_UNITDATA_REQs are handled "udp_wput_other_end: q %p (%S)",
"udp_wput_other_end: q %p (%S)", q,
"unbindreq");
* Use upper queue for option processing in * case the request is not handled at this * level and needs to be passed down to IP. "udp_wput_other_end: q %p (%S)",
* Use upper queue for option processing in * case the request is not handled at this * level and needs to be passed down to IP. "udp_wput_other_end: q %p (%S)",
"udp_wput_other_end: q %p (%S)",
/* The following TPI message is not supported by udp. */ "udp_wput_other_end: q %p (%S)",
/* The following 3 TPI messages are illegal for udp. */ "udp_wput_other_end: q %p (%S)",
* If a default destination address has not * been associated with the stream, then we * don't know the peer's name. "udp_wput_other_end: q %p (%S)",
* For TI_GETPEERNAME and TI_GETMYNAME, we first * need to copyin the user's strbuf structure. * Processing will continue in the M_IOCDATA case "udp_wput_other_end: q %p (%S)",
/* nd_getset performs the necessary checking */ "udp_wput_other_end: q %p (%S)",
* Either sockmod is about to be popped and the * socket would now be treated as a plain stream, * or a module is about to be pushed so we could * no longer use read-side synchronous stream. * Drain any queued data and disable direct sockfs * Disable read-side synchronous * stream interface and drain any "udp_wput_other_end: q %p (%S)", q,
"iocdata");
/* Unrecognized messages are passed through without change. */ "udp_wput_other_end: q %p (%S)", q,
"end");
* udp_wput_iocdata is called by udp_wput_other to handle all M_IOCDATA /* Make sure it is one of ours. */ * The address has been copied out, so now * The address and strbuf have been copied out. * We're done, so just acknowledge the original * Something strange has happened, so acknowledge * the original M_IOCTL with an EPROTO error. * Now we have the strbuf structure for TI_GETMYNAME * and TI_GETPEERNAME. Next we copyout the requested * address and then we'll copyout the strbuf. * udp_v6src is not set, we might be bound to * local address instead (that could * also still be INADDR_ANY) /* udp->udp_family == AF_INET6 */ * udp_v6src is not set, we might be bound to * local address instead (that could * also still be UNSPECIFIED) /* udp->udp_family == AF_INET6) */ /* udp->udp_family == AF_INET6 */ /* Copy out the address */ * Use upper queue for option processing since the callback * routines expect to be called in UDP instance instead of IP. * Note: No special action needed in this * module for "is_absreq_failure" return (-
1);
/* failure */ return (0);
/* success */ /* Not a power of two. Round up to nearest power of two */ for (i = 0; i <
31; i++) {
* We get here whenever we do qreply() from IP, * i.e as part of handlings ioctls, etc. * Read-side synchronous stream info entry point, called as a * result of handling certain STREAMS ioctl operations. /* If shutdown on read has happened, return nothing */ * Return the number of messages. * Return size of all data messages. * Return size of first data message. * Return data contents of first message. * Read-side synchronous stream entry point. This is called as a result * of recv/read operation done at sockfs, and is guaranteed to execute * outside of the interrupt thread context. It returns a single datagram * (b_cont chain of T_UNITDATA_IND plus data) to the upper layer. /* We should never get here when we're in SNMP mode */ * Dequeue datagram from the head of the list and return * it to caller; also ensure that RSLEEP sd_wakeq flag is * set/cleared depending on whether or not there's data /* Last datagram in the list? */ /* No longer flow-controlling? */ * Either we just dequeued the last datagram or * we get here from sockfs and have nothing to * return; in this case clear RSLEEP. * More data follows; we need udp_rrw() to be * called in future to pick up the rest. * Enqueue a completely-built T_UNITDATA_IND message into the receive * list; this is typically executed within the interrupt thread context * and so we do things as quickly as possible. * Wake up and signal the receiving app; it is okay to do this * before enqueueing the mp because we are holding the drain lock. * One of the advantages of synchronous stream is the ability for * us to find out when the application performs a read on the * socket by way of udp_rrw() entry point being called. We need * to generate SIGPOLL/SIGIO for each received data in the case * of asynchronous socket just as in the strrput() case. However, * we only wake the application up when necessary, i.e. during the * first enqueue. 
When udp_rrw() is called, we send up a single * datagram upstream and call STR_WAKEUP_SET() again when there * are still data remaining in our receive queue. /* Need to flow-control? */ /* Update poll events and send SIGPOLL/SIGIO if necessary */ * Drain the contents of receive list to the module upstream; we do * this during close or when we fallback to the slow mode due to * sockmod being popped or a module being pushed on top of us. * There is no race with a concurrent udp_input() sending * up packets using putnext() after we have cleared the * udp_direct_sockfs flag but before we have completed * sending up the packets in udp_rcv_list, since we are * either a writer or we have quiesced the conn. * Send up everything via putnext(); note here that we * don't need the udp_drain_lock to protect us since * nothing can enter udp_rrw() and that we currently * have exclusive access to this udp. /* We add a bit of extra buffering */ * Little helper for IPsec's NAT-T processing.