ip_rts.c revision 6e91bba0d6c6bdabbba62cefae583715a4a58e2a
 * Copyright 2010 Sun Microsystems, Inc. All rights reserved.
 * Use is subject to license terms.

 * Copyright (c) 1988, 1991, 1993
 * The Regents of the University of California. All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. All advertising materials mentioning features or use of this software
 *    must display the following acknowledgement:
 *    This product includes software developed by the University of
 *    California, Berkeley and its contributors.
 * 4. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.

 * This file contains routines that process routing socket requests.

 * Send `mp' to all eligible routing queues. A queue is ineligible if:
 *
 *  1. SO_USELOOPBACK is off and it is the originating queue.
 *  2. RTA_UNDER_IPMP is on and RTSQ_UNDER_IPMP is not set in `flags'.
 *  3. RTA_UNDER_IPMP is off and RTSQ_NORMAL is not set in `flags'.
 *  4. It is not the same address family as `af', and `af' isn't AF_UNSPEC.

 * Since we don't have an ill_t here, RTSQ_DEFAULT must already be
 * resolved to one or more of RTSQ_NORMAL|RTSQ_UNDER_IPMP at this point.

 * If there was a family specified when this routing socket was
 * created and it doesn't match the family of the message to
 * copy, then continue.

 * Queue the message only if the conn_t and flags match.

 * For the originating queue, we only copy the message upstream
 * if loopback is set. For others reading on the routing
 * socket, we check if there is room upstream for a copy of the

/* Pass to rts_input */
/* Note that we pass a NULL ira to rts_input */
/* reload next_connp since conn_next may have changed */

 * Takes an ire and sends an ack to all the routing sockets. This
 * - when a stale redirect is deleted

 * This is a call from the RTS module
 * indicating that this is a Routing Socket
 * Stream. Insert this conn_t in routing

 * This is a call from the RTS module indicating that it is closing.
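The eligibility rules for routing queues can be sketched as a single predicate. This is only an illustration, not the kernel code: `struct sk_rts_sock`, `sk_rts_eligible` and the `SK_RTSQ_*` values are hypothetical stand-ins, and the loopback rule follows the in-loop comment (the originating queue receives a copy only if loopback is set).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/socket.h>	/* AF_UNSPEC, AF_INET, AF_INET6 */

/* Hypothetical flag values; the real RTSQ_* definitions are not shown here. */
#define	SK_RTSQ_UNDER_IPMP	0x01
#define	SK_RTSQ_NORMAL		0x02

/* Hypothetical stand-in for the per-socket state consulted by the rules. */
struct sk_rts_sock {
	int	family;		/* family given at creation, or AF_UNSPEC */
	bool	useloopback;	/* SO_USELOOPBACK */
	bool	under_ipmp;	/* RTA_UNDER_IPMP */
};

/* Return true if a message of family `af' should be copied to socket `s'. */
static bool
sk_rts_eligible(const struct sk_rts_sock *s, bool is_origin, int af,
    unsigned int flags)
{
	/* The originating queue only gets a copy if loopback is set. */
	if (is_origin && !s->useloopback)
		return (false);
	/* IPMP-aware sockets need RTSQ_UNDER_IPMP in `flags'. */
	if (s->under_ipmp && !(flags & SK_RTSQ_UNDER_IPMP))
		return (false);
	/* "Normal" sockets need RTSQ_NORMAL in `flags'. */
	if (!s->under_ipmp && !(flags & SK_RTSQ_NORMAL))
		return (false);
	/* A family chosen at creation time must match, unless unspecified. */
	if (af != AF_UNSPEC && s->family != AF_UNSPEC && s->family != af)
		return (false);
	return (true);
}
```

A caller would evaluate this once per routing socket while walking the list, and skip any socket for which it returns false.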
 * Processes requests received on a routing socket. It extracts all the
 * arguments and calls the appropriate function to process the request.
 *
 * RTA_SRC bit flag requests are sent by 'route -setsrc'.
 *
 * In general, this function does not consume the message supplied but rather
 * sends the message upstream with an appropriate UNIX errno.

 * Check the routing message for basic consistency including the
 * version number and that the number of octets written is the same
 * as specified by the rtm_msglen field.

 * At this point, an error can be delivered back via rtm_errno.

/* Only allow RTM_GET or RTM_RESOLVE for unprivileged processes */

 * Based on the address family of the destination address, determine
 * the destination, gateway and netmask and return the appropriate error
 * if an unknown address family was specified (following the errno
 * values that 4.4BSD-Lite2 returns).

 * These errno values are meant to be compatible with
 * 4.4BSD-Lite2 for the given message types.

 * At this point, the address family must be something known.

 * Since all interfaces in an IPMP group must be equivalent,
 * we prevent changes to a specific underlying interface's
 * routing configuration. However, for backward compatibility,
 * we interpret a request to add a route on an underlying
 * interface as a request to add a route on its IPMP interface.

ill = NULL;	/* already refrele'd */

 * This provides the same zoneid as in Solaris 10
 * that -ifp picks the zoneid from the first ipif on the ill.
 * But it might not be useful since the first ipif will always
 * have the same zoneid as the ill.

 * If a netmask was supplied in the message, then subsequent route
 * lookups will attempt to match on the netmask as well.

 * We only process any passed-in route security attributes for
 * either RTM_ADD or RTM_CHANGE messages; we overload them
 * to do an RTM_GET as a different label; ignore otherwise.

/* if we are adding a route, gateway is a must */
/* Multirouting does not support net routes. */

 * Multirouting and user-specified source addresses
 * do not support interface based routing.
 * Assigning a source address to an interface based
 * route is achievable by plumbing a new ipif and
 * setting up the interface route via this ipif,

 * The RTF_SETSRC flag is present, check that
 * the supplied src address is not the loopback
 * address. This would produce martian packets.

 * Also check that the supplied address is a
 * valid, local one. Only allow IFF_UP ones

 * The RTF_SETSRC modifier must be associated
 * with a non-null source address.

 * The RTF_SETSRC flag is present, check that
 * the supplied src address is not the loopback
 * address. This would produce martian packets.

 * Also check that the supplied address is a
 * valid, local one. Only allow UP ones.

 * The RTF_SETSRC modifier must be associated
 * with a non-null source address.

/* if we are deleting a route, gateway is a must */

 * The RTF_SETSRC modifier does not make sense

 * In the case of RTM_GET, the forwarding table should be
 * searched recursively. Also, if a gateway was
 * specified then the gateway address must also be matched.
 *
 * In the case of RTM_CHANGE, the gateway address (if supplied)
 * is the new gateway address so matching on the gateway address
 * is not done.
 * This can lead to ambiguity when looking up the
 * route to change as usually only the destination (and netmask,
 * if supplied) is used for the lookup. However, if an RTA_IFP
 * sockaddr is also supplied, it can disambiguate which route to
 * change provided the ambiguous routes are tied to distinct
 * ill's (or interface indices). If the routes are not tied to
 * any particular interfaces (for example, with traditional
 * gateway routes), then an RTA_IFP sockaddr will be of no use as
 * it won't match any such routes.
 *
 * RTA_SRC is not supported for RTM_GET and RTM_CHANGE,
 * except when RTM_CHANGE is combined with RTF_SETSRC.

 * Do not want to change the gateway,
 * but rather the source address.

 * If the netmask is all ones (either as supplied or as derived
 * above), then first check for an IRE_LOOPBACK or

 * If we didn't check for or find an IRE_LOOPBACK or IRE_LOCAL
 * entry, then look for any other type of IRE.

 * Want to return failure if we get an IRE_NOROUTE from

/* we know the IRE before we come here */

 * Do not allow the multirouting state of a route
 * to be changed. This aims to prevent undesirable
 * states where both multirt and non-multirt routes
 * for the same destination are declared.

 * Note that we do not need to do
 * ire_flush_cache_*(IRE_FLUSH_ADD) as a change
 * in metrics or gateway will not affect existing
 * routes since it does not create a more specific

 * present, check that the
 * supplied src address is not
 * the loopback address. This

 * supplied addr is a valid

 * Let conn_ixa caching know that
 * source address selection changed

 * present, check that the
 * supplied src address is not
 * the loopback address. This

 * supplied addr is a valid

 * Let conn_ixa caching know that
 * source address selection changed

 * Create and add the security attribute to
 * prefix IRE; it will add a reference to the
 * group upon allocating a new entry. If it
 * finds an already-existing entry for the
 * security attribute, it simply returns it
 * and no new group reference is made.
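The RTF_SETSRC comments above require a source address that is non-null and is not the loopback address. A minimal IPv4 sketch of just those two checks follows; `sk_setsrc_check_v4` is a hypothetical helper, the choice of EINVAL is an assumption, and the further requirement that the address be a valid, local, UP one is omitted because it needs interface state that is not available here.

```c
#include <assert.h>
#include <errno.h>
#include <arpa/inet.h>	/* htonl, inet_pton */
#include <netinet/in.h>	/* struct in_addr, INADDR_ANY, INADDR_LOOPBACK */

/*
 * Hypothetical helper: return 0 if `src' passes the two address checks
 * named in the comments, or an errno value (EINVAL is an assumption).
 */
static int
sk_setsrc_check_v4(struct in_addr src)
{
	/* RTF_SETSRC must come with a non-null source address. */
	if (src.s_addr == htonl(INADDR_ANY))
		return (EINVAL);
	/* The loopback address would produce martian packets. */
	if (src.s_addr == htonl(INADDR_LOOPBACK))
		return (EINVAL);
	return (0);
}
```

An IPv6 variant would presumably apply the same two checks using the standard IN6_IS_ADDR_UNSPECIFIED and IN6_IS_ADDR_LOOPBACK macros.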
/* OK ACK already set up by caller except this */
ip2dbg(("ip_rts_request: OK ACK\n"));
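The basic consistency check that the request-processing comments describe (version number, and octets written equal to rtm_msglen) can be sketched as follows. The header layout `struct sk_rt_msghdr`, the value of `SK_RTM_VERSION`, and the errno values returned are assumptions for illustration only; they are not taken from the real rt_msghdr_t definition.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	SK_RTM_VERSION	3	/* assumed value, for illustration only */

/* Hypothetical, shortened message header; the real one has more fields. */
struct sk_rt_msghdr {
	uint16_t	rtm_msglen;	/* total length the sender claims */
	uint8_t		rtm_version;	/* must match the supported version */
	uint8_t		rtm_type;	/* message type, unused in this sketch */
};

/* Return 0 if the message is self-consistent, otherwise an errno value. */
static int
sk_rts_check_msg(const void *buf, size_t nwritten)
{
	struct sk_rt_msghdr hdr;

	/* The write must at least cover the fixed header. */
	if (nwritten < sizeof (hdr))
		return (EINVAL);
	memcpy(&hdr, buf, sizeof (hdr));

	if (hdr.rtm_version != SK_RTM_VERSION)
		return (EPROTONOSUPPORT);
	/* Octets written must equal what rtm_msglen specifies. */
	if (hdr.rtm_msglen != nwritten)
		return (EINVAL);
	return (0);
}
```

Running this check before any address parsing means later code can trust rtm_msglen as the bound for walking the sockaddrs that follow the header.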
 * Helper function that can do recursive lookups including when
 * MATCH_IRE_GW and/or MATCH_IRE_MASK is set.

 * ire_route_recursive can't match gateway or mask thus if they are
 * set we have to do two steps of lookups

/* The first ire_gw_secattr is passed back */
/* Look for an interface ire recursively based on the gateway */

 * Don't allow anything unusual past the first
 * iteration. Clearing ifire means caller will not see a
 * complete response - there will be no RTA_IFP returned.

 * ire_route_recursive can't match gateway or mask thus if they are
 * set we have to do two steps of lookups

/* The first ire_gw_secattr is passed back */

 * Don't allow anything unusual past the first
 * iteration. Clearing ifire means caller will not see a
 * complete response - there will be no RTA_IFP returned.

 * Handle IP_IOC_RTS_REQUEST ioctls

 * The Routing Socket data starts on
 * next block. If there is no next block
 * this is an indication from routing module
 * that it is a routing socket stream queue.
 * We need to support that for compatibility with SDP since
 * it has a contract private interface to use IP_IOC_RTS_REQUEST.
 * Note: SDP no longer uses IP_IOC_RTS_REQUEST - we can remove this.

 * This is a message from SDP
 * indicating that this is a Routing Socket
 * Stream. Insert this conn_t in routing

/* Note that we pass a NULL ira to rts_input */
/* conn was refheld in ip_wput_ioctl. */

 * Build a reply to the RTM_GET request contained in the given message block
 * using the retrieved IRE of the destination address, the parent IRE (if it
 * exists) and the address family.
 *
 * Returns a pointer to a message block containing the reply if successful,
 * otherwise NULL is returned.

 * Find the ill used to send packets. This will be NULL in case
 * of a reject or blackhole.

 * Always return RTA_DST, RTA_GATEWAY and RTA_NETMASK.
 * RTA_IFP and RTA_IFA if either is defined, and also
 * returns RTA_BRD if the appropriate interface is

 * We associate an IRE with an ILL, hence we don't exactly
 * know what might make sense for RTA_IFA and RTA_BRD. We
 * pick the first ipif on the ill.

 * We set the destination address, gateway address,
 * netmask and flags in the RTM_GET response depending
 * on whether we found a parent IRE or not.
 * In particular, if we did find a parent IRE during the
 * recursive search, use that IRE's gateway address.
 * Otherwise, we use the IRE's source address for the

 * The rtm_msglen, rtm_version and rtm_type fields in
 * RTM_GET response are filled in by rts_fill_msg.
 *
 * rtm_addrs and rtm_flags are filled in based on what
 * was requested and the state of the IREs looked up
 *
 * rtm_inits and rtm_rmx are filled in with metrics
 * based on whether a parent IRE was found or not.
 *
 * TODO: rtm_index and rtm_use should probably be
 * filled in with something reasonable here and not just
 * copied from the request.

 * Fill the given if_data_t with interface statistics.

/* ethernet, tokenring, etc */
/* metric (external only) */

 * Set the metrics on a forwarding table route.

/* Need to add back some metrics to the IRE? */

 * Bypass obtaining the lock and searching ill_saved_ire_mp in the
 * common case of no metrics.

 * iulp_rtt and iulp_rtt_sd are in milliseconds, but 4.4BSD-Lite2's
 * <net/route.h> says: rmx_rtt and rmx_rttvar are stored as

 * Update the metrics in the IRE itself.

 * Search through the ifrt_t chain hanging off the ILL in order to
 * reflect the metric change there.

 * On a given ill, the tuple of address, gateway, mask,
 * ire_type and zoneid is unique for each saved IRE.

 * Update any IRE_IF_CLONE created from this IRE_IF so they

 * We do that by deleting them; ire_create_if_clone will pick

 * Get the metrics from a forwarding table route.
 * iulp_rtt and iulp_rtt_sd are in milliseconds, but 4.4BSD-Lite2's
 * <net/route.h> says: rmx_rtt and rmx_rttvar are stored as

 * Given two sets of metrics (src and dst), use the dst values if they are
 * set. If a dst value is not set but the src value is set, then we use
 * dst is updated with the new values.
 * This is used to merge information from a dce_t and ire_metrics, where the
 * dce values take precedence.

 * Takes a pointer to a routing message and extracts necessary info by looking
 * at the rtm->rtm_addrs bits and stores the requested sockaddrs in the pointers
 * passed (all of which must be valid).
 *
 * The bitmask of sockaddrs actually found in the message is returned, or zero
 * is returned in the case of an error.

 * At present we handle only RTA_DST, RTA_GATEWAY, RTA_NETMASK, RTA_IFP,
 * RTA_IFA and RTA_AUTHOR. The rest will be added as we need them.

 * The address family we are working with starts out as
 * AF_UNSPEC, but is set to the one specified with the

 * If the "working" address family has been set to
 * something other than AF_UNSPEC, then the address family of
 * subsequent sockaddrs must either be AF_UNSPEC (for
 * compatibility with older programs) or must be the same as our

 * This code assumes that RTA_DST (1) comes first in the loop.

/* Source address of the incoming packet */

 * Parse the routing message and look for any security-
 * related attributes for the route. For each valid
 * route security attributes.

 * Fills the message with the given info.

 * First find the type of the message

 * Now find the size of the data
 * that follows the message header.

 * RTA_BRD is used typically to specify a point-to-point

 * set the fields that are common to

 * Allocates and initializes a routing socket message.
 * Note that sacnt is either zero or one.

 * Returns the size of the routing
 * socket message header.

 * Returns the size of the message needed with the given rtm_addrs and family.
 * It is assumed that all of the sockaddrs (with the exception of RTA_IFP) are
 * of the same family (currently either AF_INET or AF_INET6).

 * This routine is called to generate a message to the routing
 * socket indicating that a redirect has occurred, a routing lookup
 * has failed, or that a protocol has detected timeouts to a particular
 * destination. This routine is called for message types RTM_LOSING,
 * RTM_REDIRECT, and RTM_MISS.

 * This routine is called to generate a message to the routing
 * socket indicating that the status of a network interface has changed.
 * Message type generated: RTM_IFINFO.

 * This message should be generated only
 * when the physical device is changing

 * If this message is for an underlying interface, prevent
 * "normal" (IPMP-unaware) routing sockets from seeing it.

 * If cmd is RTM_ADD or RTM_DELETE, generate the rt_msghdr_t message;
 * otherwise (RTM_NEWADDR, RTM_DELADDR, RTM_CHGADDR and RTM_FREEADDR)
 * generate the ifa_msghdr_t message.

 * Do not report unspecified address if this is the RTM_CHGADDR or

 * This is called to generate messages to the routing socket
 * indicating a network interface has had addresses associated with it.
 * The structure of the code is based on the 4.4BSD-Lite2 <net/rtsock.c>.

 * If this message is for an underlying interface, prevent
 * "normal" (IPMP-unaware) routing sockets from seeing it.

 * Let conn_ixa caching know that source address selection

 * If the request is DELETE, send RTM_DELETE and RTM_DELADDR.
 * If the request is ADD, send RTM_NEWADDR and RTM_ADD.
 * Otherwise simply send the request.

 * Based on the address family specified in a sockaddr, copy the address field

 * In the case of AF_UNSPEC, we assume the family is actually AF_INET for
 * compatibility with programs that leave the family cleared in the sockaddr.
 * Callers of rts_copyfromsockaddr should check the family themselves if they
 * wish to verify its value.

 * In the case of AF_INET6, a check is made to ensure that the address is not an