 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 * When distributing Covered Code, include this CDDL HEADER in each
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 * Copyright (c) 1991, 2010, Oracle and/or its affiliates. All rights reserved.
 * Copyright (c) 1990 Mentat Inc.
/* Temporary; for CR 6451644 work-around */
 * These rules should be judiciously applied
 * if there is a need to identify something as IPv6 versus IPv4
 * IPv6 functions will end with _v6 in the ip module.
 * IPv6 functions will end with _ipv6 in the transport modules.
 * Some macros end with _V6; e.g. ILL_FRAG_HASH_V6
 * Some macros start with V6_; e.g. V6_OR_V4_INADDR_ANY
 * And then there are ..V4_PART_OF_V6.
 * The intent is that macros in the ip module end with _V6.
 * IPv6 global variables will start with ipv6_
 * IPv6 structures will start with ipv6
 * IPv6 defined constants should start with IPV6_
 * (but then there are NDP_DEFAULT_VERS_PRI_AND_FLOW, etc)
 * We need to do this because we didn't obtain the IP6OPT_LS (0x0a)
 * from IANA. This mechanism will remain in effect until an official
{
0xffffffffU,
0xffffffffU,
0xffffffffU,
0xffffffffU };
#endif /* _BIG_ENDIAN */
#endif /* _BIG_ENDIAN */
#endif /* _BIG_ENDIAN */
#endif /* _BIG_ENDIAN */
#endif /* _BIG_ENDIAN */
{
0xff020000U, 0,
0x00000001U,
0xff000000U };
{
0x000002ffU, 0,
0x01000000U,
0x000000ffU };
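The two initializers above hold the same 128-bit value, ff02::1:ff00:0 (the solicited-node multicast prefix), as four 32-bit words: once as stored on a big-endian host and once as stored on a little-endian host. The following is a minimal userland sketch of that equivalence, assuming a POSIX host. The helper names `solicited_node_prefix_words` and `prefix_to_text` are hypothetical and do not appear in the source.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from the source): fill w[0..3] with the words
 * of ff02::1:ff00:0 as they must sit in memory on this host. htonl()
 * yields 0xff020000 on a big-endian host and 0x000002ff on a
 * little-endian host, matching the two initializers.
 */
static void
solicited_node_prefix_words(uint32_t w[4])
{
	w[0] = htonl(0xff020000U);
	w[1] = 0;
	w[2] = htonl(0x00000001U);
	w[3] = htonl(0xff000000U);
}

/* Render the stored words as text; returns 0 on success, -1 on failure. */
static int
prefix_to_text(char *buf, size_t len)
{
	uint32_t w[4];
	struct in6_addr addr;

	assert(buf != NULL);
	solicited_node_prefix_words(w);
	memcpy(&addr, w, sizeof (addr));
	return (inet_ntop(AF_INET6, &addr, buf, (socklen_t)len) != NULL ?
	    0 : -1);
}
```

Because the words are produced with `htonl()`, the same code yields the correct in-memory layout on either byte order, which is what the `_BIG_ENDIAN` conditional achieves at compile time.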
#endif /* _BIG_ENDIAN */
 * icmp_inbound_v6 deals with ICMP messages that are handled by IP.
 * If the ICMP message is consumed by IP, i.e., it should not be delivered
 * to any IPPROTO_ICMP raw sockets, then it returns NULL.
 * Likewise, if the ICMP error is malformed (too short, etc), then it
 * returns NULL. The caller uses this to determine whether or not to send
 * All error messages are passed to the matching transport stream.
 * See comment for icmp_inbound_v4() on how IPsec is handled.
/* Check for Martian packets */
/* Make sure ira_l2src is set for ndp_input */
 * We will set "interested" to "true" if we should pass a copy to
 * the transport i.e., if it is an error message.
 * We must have exclusive use of the mblk to convert it to
/* We now allow a RAW socket to receive this. */
 * The next three icmp messages will be handled by MLD.
 * Pass all valid MLD packets up to any process(es)
 * listening on a raw ICMP socket.
/* If there is an ICMP client and we want one too, copy it. */
/* Caller will deliver to RAW sockets */
/* Neither we nor raw sockets are interested. Drop packet now */
 * ICMP error or redirect packet. Make sure we have enough of
 * the header and that db_ref == 1 since we might end up modifying
 * In case mp has changed, verify the message before any further
/* Update DCE and adjust MTU in icmp header if needed */
 * Send an ICMP echo reply.
 * The caller has already updated the payload part of the packet.
 * We handle the ICMP checksum, IP source address selection and feed
 * the packet into ip_output_simple.
 * Remove any extension headers (do not reverse a source route)
 * and clear the flow id (keep traffic class for now).
/* Reverse the source and destination addresses. */
/* set the hop limit */
 * Prepare for checksum by putting icmp length in the icmp
 * checksum field. The checksum is calculated in ip_output
 * This packet should go out the same way as it
 * came in, i.e., in clear, independent of the IPsec
 * policy for transmitting packets.
/* Note: mp already consumed and ip_drop_packet done */
/* Was the destination (now source) link-local? Send out same group */
 * Not one of our addresses (IRE_LOCALs), thus we let
 * ip_output_simple pick the source.
/* Should we send using dce_pmtu? */
 * Verify the ICMP message for either an ICMP error or a redirect packet.
 * The caller should have fully pulled up the message. If it's a redirect
 * packet, only basic checks on IP header will be done; otherwise, verify
 * the packet by looking at the included ULP header.
 * Called before icmp_inbound_error_fanout_v6 is called.
 * Stop here for ICMP_REDIRECT.
/* Try to pass the ICMP message to clients who need it */
 * Verify we have at least ICMP_MIN_TP_HDR_LEN bytes of
 * Verify we have at least ICMP_MIN_TP_HDR_LEN bytes of
 * With IPMP we need to match across group, which we do
 * since we have the upper ill from ira_ill.
 * Verify we have at least ICMP_MIN_TP_HDR_LEN bytes of
/* Look for self-encapsulated packets that caused an error */
/* We pulled up everything already. Must be truncated */
 * Process received IPv6 ICMP Packet too big.
 * The caller is responsible for validating the packet before passing it in
 * and also to fanout the ICMP error to any matching transport conns. Assumes
 * the message has been fully pulled up.
 * Before getting here, the caller has called icmp_inbound_verify_v6()
 * that should have verified with ULP to prevent undoing the changes we're
 * going to make to DCE. For example, TCP might have verified that the packet
 * which generated error is in the send window.
 * In some cases this modifies the MTU in the ICMP header of the packet; the
 * caller should pass to the matching ULP after this returns.
/* Caller has already pulled up everything. */
 * For link local destinations matching simply on address is not
 * sufficient. Same link local addresses for different ILL's is
/* Couldn't add a unique one - ENOMEM */
ip1dbg((
"Received mtu less than IPv6 " * If an mtu less than IPv6 min mtu is received, * we must include a fragment header in ip1dbg((
"Received mtu from router: %d\n",
mtu));
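The Packet Too Big handling around this point clamps a reported MTU that is below the IPv6 minimum and, when a fragment header is needed in every packet, reserves room for it before reporting the value to the ULP. The following is a minimal sketch of that arithmetic only. The type `pmtu_update_t`, the function `pkt_too_big_update`, and the `SKETCH_` constants are hypothetical names, and the `always_frag_hdr` parameter stands in for the multirouting case mentioned in the comments; the real code keeps this state in a DCE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define	SKETCH_IPV6_MIN_MTU	1280	/* minimum IPv6 link MTU */
#define	SKETCH_FRAG_HDR_LEN	8	/* size of the IPv6 fragment header */

/* Hypothetical result type; not a structure from the source. */
typedef struct {
	uint32_t pmtu;		/* path MTU to remember */
	uint32_t ulp_mtu;	/* value reported to the ULP */
	bool need_frag_hdr;	/* fragment header in every packet? */
} pmtu_update_t;

static pmtu_update_t
pkt_too_big_update(uint32_t reported_mtu, bool always_frag_hdr)
{
	pmtu_update_t u;

	u.need_frag_hdr = always_frag_hdr;
	u.pmtu = reported_mtu;
	if (reported_mtu < SKETCH_IPV6_MIN_MTU) {
		/* Never go below the minimum; fragment instead. */
		u.pmtu = SKETCH_IPV6_MIN_MTU;
		u.need_frag_hdr = true;
	}
	/* Leave room for the fragment header when one is always sent. */
	u.ulp_mtu = u.need_frag_hdr ?
	    u.pmtu - SKETCH_FRAG_HDR_LEN : u.pmtu;
	return (u);
}
```

For example, a router reporting an MTU of 1000 leads to a stored path MTU of 1280 and a ULP value of 1272, since every packet then carries an 8-byte fragment header.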
/* Prepare to send the new max frag size for the ULP. */ * If we need a fragment header in every packet * (above case or multirouting), make sure the * ULP takes it into account when computing the /* We now have a PMTU for sure */ * After dropping the lock the new value is visible to everyone. * Then we bump the generation number so any cached values reinspect * Fanout received ICMPv6 error packets to the transports. * Assumes the IPv6 plus ICMPv6 headers have been pulled up but nothing else. * The caller must have called icmp_inbound_verify_v6. uint16_t *
up;
/* Pointer to ports in ULP header */
/* Caller has already pulled up everything. */
 * We need a separate IP header with the source and destination
 * the ICMPv6 error is in the form we sent it out.
/* Try to pass the ICMP message to clients who need it */
/* Attempt to find a client stream based on port. */
/* Note that we send error to all matches. */
 * Attempt to find a client stream based on port.
 * Note that we do a reverse lookup since the header is
 * in the form we sent it out.
 * With IPMP we need to match across group, which we do
 * since we have the upper ill from ira_ill.
/* Note that mp is NULL */
/* Not TCP; must be SOCK_RAW, IPPROTO_TCP */
/* Find a SCTP client stream for this packet. */
/* Just in case ipsec didn't preserve the NULL b_cont */
 * If successful, the mp has been modified to not include
 * the ESP/AH header so we can fanout to the ULP's icmp
/* Don't call hdr_length_v6() unless you have to. */
/* Verify the modified message before any further processes. */
/* Look for self-encapsulated packets that caused an error */
 * Self-encapsulated case. As in the ipv4 case,
 * we need to strip the 2nd IP header. Since mp
 * is already pulled-up, we can simply bcopy
 * the 3rd header + data over the 2nd header.
 * Make sure we don't do recursion more than once.
 * Copy the 3rd header + remaining data on top
 * Subtract length of the 2nd header.
/* Don't call hdr_length_v6() unless you have to. */
 * Verify the modified message before any further
 * Now recurse, and see what I _really_ should be
 * No IP tunnel is interested, fallthrough and see
 * if a raw socket will want it.
ip1dbg((
"icmp_inbound_error_fanout_v6: drop pkt\n"));
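The self-encapsulated case in the comments for this routine strips the second IP header by copying the third header and the remaining data over it, then subtracting the second header's length. The following is a minimal userland sketch of that buffer operation, assuming one contiguous (pulled-up) buffer. The function `strip_second_header` is a hypothetical name, and `memmove()` is used here as the overlap-safe userland counterpart of the kernel `bcopy()` the comment refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: buf holds [hdr1][hdr2][hdr3 + data] in a single
 * contiguous buffer of total length len. Slide everything after hdr2
 * down over hdr2 and return the new total length. Source and
 * destination overlap, hence memmove().
 */
static size_t
strip_second_header(unsigned char *buf, size_t len, size_t hdr1_len,
    size_t hdr2_len)
{
	size_t tail_off = hdr1_len + hdr2_len;

	assert(tail_off <= len);
	memmove(buf + hdr1_len, buf + tail_off, len - tail_off);
	return (len - hdr2_len);
}
```

After the copy, the caller would re-verify the shortened message before recursing, as the surrounding comments describe.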
* Process received IPv6 ICMP Redirect messages. * Assumes the caller has verified that the headers are in the pulled up mblk. * Since ira_ill is where the IRE_LOCAL was hosted we use ira_rill * and make it be the IPMP upper so avoid being confused by a packet * addressed to a unicast address on a different ill. /* Verify if it is a valid redirect */ * Verify that the IP source address of the redirect is * the same as the current first-hop router for the specified * ICMP destination address. * Also, Make sure we had a route for the dest in question and * that route was pointing to the old gateway (the source of the * We do longest match and then compare ire_gateway_addr_v6 below. * the redirect was not from ourselves * old gateway is still directly reachable * Check to see if link layer address has changed and * process the ncec_state accordingly. ip1dbg((
"icmp_redirect_v6: NCE create failed %d\n",
* Create a Route Association. This will allow us to remember * a router told us to use the particular gateway. * Just create an on link entry, i.e. interface route. * The gateway field is our link-local on the ill. /* We have no link-local address! */ dst,
/* gateway == dst */ /* Check if it was a duplicate entry */ /* tell routing sockets that we received a redirect */ * Delete any existing IRE_HOST type ires for this destination. * This together with the added IRE has the effect of * modifying an existing redirect. * Build and ship an IPv6 ICMP message using the packet data in mp, * and the ICMP header pointed to by "stuff". (May be called as * Note: assumes that icmp_pkt_err_ok_v6 has been called to * verify that an icmp error packet can be sent. * If v6src_ptr is set use it as a source. Otherwise select a reasonable * source address (see above function). * If the source of the original packet was link-local, then * make sure we send on the same ill (group) as we received it on. * Apply IPsec based on how IPsec was applied to * the packet that had the error. * If it was an outbound packet that caused the ICMP * error, then the caller will have setup the IRA /* Note: mp already consumed and ip_drop_packet done */ * This is in clear. The icmp message we are building * here should go out in clear, independent of our policy. * If the caller specified the source we use that. * Otherwise, if the packet was for one of our unicast addresses, make * sure we respond with that as the source. Otherwise * have ip_output_simple pick the source address. * Set IXAF_TRUSTED_ICMP so we can let the ICMP messages this * node generates be accepted in peace by all on-host destinations. * If we do NOT assume that all on-host destinations trust * self-generated ICMP messages, then rework here, ip6.c, and spd.c. * (Look for IXAF_TRUSTED_ICMP). * Prepare for checksum by putting icmp length in the icmp * checksum field. The checksum is calculated in ip_output_wire_v6. * Update the output mib when ICMPv6 packets are sent. * Check if it is ok to send an ICMPv6 error packet in * response to the IP packet in mp. * Free the message and return null if no * ICMP error packet should be sent. /* We view multicast and broadcast as the same.. 
*/
/* Check if source address uniquely identifies the host */
/* Explicitly do not generate errors in response to redirects */
 * Check that the destination is not multicast and that the packet
 * was not sent on link layer broadcast or multicast. (Exception
 * is Packet too big message as per the draft - when mcast_ok is set.)
 * If this is a labeled system, then check to see if we're allowed to
 * send a response to this particular sender. If not, then just drop.
 * Only send ICMP error packets every so often.
 * but for now this will suffice.
 * Called when a packet was sent out the same link that it arrived on.
 * Check if it is ok to send a redirect and then send it.
 * Don't send a redirect when forwarding a source
/* Target is directly connected */
/* Determine the most specific IRE used to send the packets */
 * We won't send redirects to a router
 * that doesn't have a link local
 * address, but will forward.
 * The source is directly connected.
 * Generate an ICMPv6 redirect message.
 * Include target link layer address option if it exists.
 * Always include redirect header.
/* max_redir_hdr_data_len and nd_opt_rh_len must be multiple of 8 */
/* Make sure mp is 8 byte aligned */
/* ipif_v6lcl_addr contains the link-local source address */
/* Redirects sent by router, and router is global zone */
/* Generate an ICMP time exceeded message. (May be called as writer.) */
 * Generate an ICMP unreachable message.
 * When called from ip_output side a minimal ip_recv_attr_t needs to be
 * constructed by the caller.
 * Generate an ICMP pkt too big message.
 * When called from ip_output side a minimal ip_recv_attr_t needs to be
 * constructed by the caller.
 * Generate an ICMP parameter problem message. (May be called as writer.)
 * 'offset' is the offset from the beginning of the packet in error.
 * When called from ip_output side a minimal ip_recv_attr_t needs to be
 * constructed by the caller.
/* Determine the offset of the bad nexthdr value */ * Verify whether or not the IP address is a valid local address. * Could be a unicast, including one for a down interface. * If allow_mcbc then a multicast or broadcast address is also * In the case of a multicast address, however, the * upper protocol is expected to reset the src address * to zero when we return IPVL_MCAST so that * no packets are emitted with multicast address as * The addresses valid for bind are: * (2) - IP address of an UP interface * (3) - IP address of a DOWN interface * (4) - a multicast address. In this case * the conn will only receive packets destined to * the specified multicast address. Note: the * application still has to issue an * IPV6_JOIN_GROUP socket option. * In all the above cases, the bound address must be valid in the current zone. * When the address is loopback or multicast, there might be many matching IREs * so bind has to look up based on the zone. * If an address other than in6addr_any is requested, * we verify that it is a valid address for bind * Note: Following code is in if-else-if form for * readability compared to a condition check. * (2) Bind to address of local UP interface /* (4) bind to multicast address. */ * Note: caller should take IPV6_MULTICAST_IF * into account when selecting a real source address. * (3) Bind to address of local DOWN interface? * (ipif_lookup_addr() looks up all interfaces * but we do not get here for UP interfaces /* Not a useful source? */ * Verify that both the source and destination addresses are valid. If * IPDF_VERIFY_DST is not set, then the destination address may be unreachable, * i.e. have no route to it. Protocols like TCP want to verify destination * reachability, while tunnels do not. * Determine the route, the interface, and (optionally) the source address * to use to reach a given destination. * Note that we allow connect to broadcast and multicast addresses when * IPDF_ALLOW_MCBC is set. 
 * first_hop and dst_addr are normally the same, but if source routing
 * they will differ; in that case the first_hop is what we'll use for the
 * routing lookup but the dce and label checks will be done on dst_addr,
 * If uinfo is set, then we fill in the best available information
 * we have for the destination. This is based on (in priority order) any
 * metrics and path MTU stored in a dce_t, route metrics, and finally the
 * Tsol note: If we have a source route then dst_addr != firsthop. But we
 * always do the label check on dst_addr.
 * Assumes that the caller has set ixa_scopeid for link-local communication.
 * We never send to zero; the ULPs map it to the loopback address.
 * We can't allow it since we use zero to mean uninitialized in some
 * Select a route; For IPMP interfaces, we would only select
 * a "hidden" route (i.e., going through a specific under_ill)
 * if ixa_ifindex has been specified.
 * ire can't be a broadcast or multicast unless IPDF_ALLOW_MCBC is set.
 * If IPDF_VERIFY_DST is set, the destination must be reachable.
 * Otherwise the destination needn't be reachable.
 * If we match on a reject or black hole, then we've got a
 * local failure. May as well fail out the connect() attempt,
 * since it's never going to succeed.
 * If we're verifying destination reachability, we always want
 * If we're not verifying destination reachability but the
 * destination has a route, we still want to fail on the
 * temporary address and broadcast address tests.
 * In both cases we let the code continue so some reasonable
 * information is returned to the caller. That enables the
 * caller to use (and even cache) the IRE. conn_ip_output will
 * use the generation mismatch path to check for the unreachable
 * case thereby avoiding any specific check in the main path.
 * Set errno but continue to set up ixa_ire to be
 * the RTF_REJECT|RTF_BLACKHOLE IRE.
* That allows callers to use ip_output to get an * Ensure that ixa_dce is always set any time that ixa_ire is set, * since some callers will send a packet to conn_ip_output() even if /* If we are creating a DCE we'd better have an ifindex */ /* Fallback to the default dce if allocation fails */ * For multicast with multirt we have a flag passed back from * ire_lookup_multi_ill_v6 since we don't have an IRE for each * possible multicast address. * We also need a flag for multicast since we can't check * whether RTF_MULTIRT is set in ixa_ire for multicast. /* Get an nce to cache. */ /* Allocation failure? */ * If the source address is a loopback address, the * destination had best be local or multicast. * If we are sending to an IRE_LOCAL using a loopback source then * it had better be the same zoneid. ire =
NULL;
/* Stored in ixa_ire */ ire =
NULL;
/* Stored in ixa_ire */
 * Does the caller want us to pick a source address?
 * We use ire_nexthop_ill to avoid the under ipmp
 * interface for source address selection. Note that for ipmp
 * probe packets, ixa_ifindex would have been specified, and
 * the ip_select_route() invocation would have picked an ire
 * with ire_ill pointing at an under interface.
/* If unreachable we have no ill but need some source */
/* Make sure we look for a better source address */
ire =
NULL;
/* Stored in ixa_ire */
 * We allow the source address to be down.
 * However, we check that we don't use the loopback address
 * as a source when sending out on the wire.
ire =
NULL;
/* Stored in ixa_ire */
 * Make sure we don't leave an unreachable ixa_nce in place
 * since ip_select_route is used when we unplumb i.e., remove
 * references on ixa_ire, ixa_nce, and ixa_dce.
 * Note that IPv6 multicast supports PMTU discovery unlike IPv4
 * multicast. But pmtu discovery is only enabled for connected
 * Set initial value for fragmentation limit. Either conn_ip_output
 * or ULP might update it when there are routing changes.
 * Handles a NULL ixa_ire->ire_ill or a NULL ixa_nce for RTF_REJECT.
/* Make sure ixa_fragsize and ixa_pmtu remain identical */
 * Extract information useful for some transports.
 * First we look for DCE metrics. Then we take what we have in
 * the metrics in the route, where the offlink is used if we have
/* Allow ire_metrics to decrease the path MTU from above */
 * Make sure we don't leave an unreachable ixa_nce in place
 * since ip_select_route is used when we unplumb i.e., remove
 * references on ixa_ire, ixa_nce, and ixa_dce.
 * Handle protocols with which IP is less intimate. There
 * can be more than one stream bound to a particular
 * protocol. When this is the case, normally each one gets a copy
 * of any incoming packets.
 * Packets will be distributed to conns in all zones. This is really only
 * useful for ICMPv6 as only applications in the global zone can create raw
 * sockets for other protocols.
/* Note: IPCL_PROTO_MATCH_V6 includes conn_wantpacket */
 * No one bound to this port. Is
 * there a client that wants all
 * XXX: Fix the multiple protocol listeners case. We should not
 * be walking the conn->conn_next list here.
/* Note: IPCL_PROTO_MATCH_V6 includes conn_wantpacket */
/* No more interested clients */
/* Memory allocation failed */
/* Follow the next pointer before releasing the conn. */
/* Last one. Send it upstream. */
 * Called when it is conceptually a ULP that would send the packet
 * e.g., port unreachable and nexthdr unknown. Check that the packet
 * would have passed the IPsec global policy before sending the error.
* Send an ICMP error after patching up the packet appropriately. * Uses ip_drop_input and bumps the appropriate MIB. * For ICMP6_PARAMPROB_NEXTHEADER we determine the offset to use. * We are generating an icmp error for some inbound packet. * Called from all ip_fanout_(udp, tcp, proto) functions. * Before we generate an error, check with global policy * to see whether this is allowed to enter the system. As * there is no "conn", we are checking with global policy. /* We never send errors for protocols that we do implement */ /* Let the system determine the offset for this one */ panic(
"ip_fanout_send_icmp_v6: wrong type");
* Fanout for UDP packets that are multicast or ICMP errors. * (Unicast fanout is handled in ip_input_v6.) * If SO_REUSEADDR is set all multicast packets * will be delivered to all conns bound to the same port. * Fanout for UDP packets. * The caller puts <fport, lport> in the ports parameter. * ire_type must be IRE_BROADCAST for multicast and broadcast packets. * If SO_REUSEADDR is set all multicast and broadcast packets * will be delivered to all conns bound to the same port. * Earlier in ip_input on a system with multiple shared-IP zones we * duplicate the multicast and broadcast packets and send them up * with each explicit zoneid that exists on that ill. * This means that here we can match the zoneid with SO_ALLZONES being special. /* Attempt to find a client stream based on destination port. */ /* No more interested clients */ /* Memory allocation failed */ /* Follow the next pointer before releasing the conn. */ /* Last one. Send it upstream. */ * No one bound to this port. Is * there a client that wants all * This routine is used by the upper layer protocols, iptun, and IPsec: * - Set extension header pointers to appropriate locations * - Determine IPv6 header length and return it * - Return a pointer to the last nexthdr value * The caller must initialize ipp_fields. * The upper layer protocols normally set label_separate which makes the * routine put the TX label in ipp_label_v6. If this is not set then * the hop-by-hop options including the label are placed in ipp_hopopts. * NOTE: If multiple extension headers of the same type are present, * ip_find_hdr_v6() will set the respective extension header pointers * to the first one that it encounters in the IPv6 header. It also * skips fragment headers. This routine deals with malformed packets * of various sorts in which case the returned length is up to the /* Is there enough left for len + nexthdr? */ /* We check for any CIPSO */ * We have dropped packets with bad options in * ip6_input. 
No need to check return value
/* return only 1st hbh */
 * ipp_dstopts is set to the destination header after a
 * Assume it is a post-rthdr destination header
 * and adjust when we find an rthdr.
/* return only 1st rthdr */
 * Make any destination header we've seen be a
 * pre-rthdr destination header.
 * Try to determine where and what are the IPv6 header length and
 * pointer to nexthdr value for the upper layer protocol (or an
 * Parameters returns a pointer to the nexthdr value;
 * Must handle malformed packets of various sorts.
 * Function returns failure for malformed cases.
/* Is there enough left for len + nexthdr? */
/* Assumes the headers are identical for hbh and dst */
/* No next header means we're finished */
 * If any known extension headers are still to be processed,
 * the packet's malformed (or at least all the IP header(s) are
 * not in the same mblk - and that should never happen.
 * If we get here, we know that all of the IP headers were in
 * the same mblk, even if the ULP header is in the next mblk.
 * Return the length of the IPv6 related headers (including extension headers)
 * Returns a length even if the packet is malformed.
 * Parse and process any hop-by-hop or destination options.
 * Assumes that q is an ill read queue so that ICMP errors for link-local
 * destinations are sent out the correct interface.
 * Returns -1 if there was an error and mp has been consumed.
 * Returns 0 if no special action is needed.
 * Returns 1 if the packet contained a router alert option for this node
 * XXX Note: In future as more hbh or dest options are defined,
 * it may be better to have different routines for hbh and dest
 * options as opt_type fields other than IP6OPT_PAD1 and IP6OPT_PADN
 * may have same value in different namespaces. Or is it same namespace ??
 * Current code checks for each opt_type (other than pads) if it is in
 * the expected nexthdr (hbh or dest)
 * Note: We don't verify that (N-2) pad octets
 * are zero as required by spec.
Adhere to * "be liberal in what you accept..." part of * implementation philosophy (RFC791,RFC1122) /* Check total length and alignment */ * Minimal support for the home address option * (which is required by all IPv6 nodes). * Implement by just swapping the home address * XXX Note: this has IPsec implications since * AH needs to take this into account. * Also, when IPsec is used we need to ensure * that this is only processed once * in the received packet (to avoid swapping * NOTE:This option processing is considered * to be unsafe and prone to a denial of * The current processing is not safe even with * IPsec secured IP packets. Since the home * address option processing requirement still * is in the IETF draft and in the process of * being redefined for its usage, it has been * decided to turn off the option by default. * If this section of code needs to be executed, * ndd variable ip6_ignore_home_address_opt * should be set to 0 at the user's own risk. * We did this dest. opt the first time * around (i.e. before AH processing). * If we've done AH... stop now. /* Check total length and alignment */ /* Swap ip6_src and the home address */ /* XXX Note: only 8 byte alignment option */ /* Determine which zone should send error */ ip1dbg((
"ip_process_options_v6: %s " ip1dbg((
"ip_process_options_v6: %s " "opt 0x%x; packet dropped\n",
/* Determine which zone should send error */
 * Process a routing header that is not yet empty.
 * Because of RFC 5095, we now reject all route headers.
/* XXX Check for source routed out same interface? */
 * Read side put procedure for IPv6 module.
 * Things are opening or closing - only accept DLPI
 * ack messages. If the stream is closing and ip_wsrv
 * has completed, ip_close is out of the qwait, but has
 * not yet completed qprocsoff. Don't proceed any further
 * because the ill has been cleaned up and things hanging
 * off the ill have been freed.
 * Walk through the IPv6 packet in mp and see if there's an AH header
 * in it. See if the AH header needs to get done before other headers in
 * the packet. (Worker function for ipsec_early_ah_v6().)
 * For now just pullup everything. In general, the fewer pullups,
 * the better, but there's so much squirrelling through anyway,
 * it's just easier this way.
 * We can't just use the argument nexthdr in the place
 * of nexthdrp because we don't dereference nexthdrp
 * till we confirm whether it is a valid address.
/* Is there enough left for len + nexthdr? */
/* Assumes the headers are identical for hbh and dst */
 * Return DONT_PROCESS because the destination
 * options header may be for each hop in a
 * routing-header, and we only want AH if we're
 * finished with routing headers.
 * If there are more hops left on the routing header,
 * return now with DON'T PROCESS.
/* Wait for reassembly */
/* No next header means we're finished */
 * Path for AH if options are present.
 * Returns NULL if the mblk was consumed.
 * Sometimes AH needs to be done before other IPv6 headers for security
 * reasons. This function (and its ipsec_needs_processing_v6() above)
 * indicates if that is so, and fans out to the appropriate IPsec protocol
 * for the datagram passed in.
/* Default means send it to AH! */
 * Either it failed or is pending. In the former case
 * ipIfStatsInDiscards was increased.
/* we're done with IPsec processing, send it up */
 * When it returns a completed message the first mblk will only contain
 * the headers prior to the fragment header, with the nexthdr value updated
 * to be the header after the fragment header.
 * We utilize hardware computed checksum info only for UDP since
 * IP fragmentation is a normal occurrence for the protocol. In
 * addition, checksum offload support for IP fragments carrying
 * UDP payload is commonly implemented across network adapters.
/* Record checksum information from the packet */
/* fragmented payload offset from beginning of mblk */
 * Partial checksum has been calculated by hardware
 * and attached to the packet; in addition, any
 * prepended extraneous data is even byte aligned.
 * If any such data exists, we adjust the checksum;
 * this would also handle any postpended data.
/* One's complement subtract extraneous checksum */
/* Clear hardware checksumming flag */
 * Determine the offset (from the beginning of the IP header)
 * of the nexthdr value which has IPPROTO_FRAGMENT. We use
 * this when removing the fragment header from the packet.
 * This packet consists of the IPv6 header, a potential
 * hop-by-hop options header, a potential pre-routing-header
 * destination options header, and a potential routing header.
/* Can't handle other headers before the fragment header */
 * Note: Fragment offset in header is in 8-octet units.
 * Clearing least significant 3 bits not only extracts
 * it but also gets it in units of octets.
 * Is the more frags flag on and the payload length not a multiple
 * Would fragment cause reassembled packet to have a payload length
 * greater than IP_MAXPACKET - the max payload size?
 * This packet just has one fragment. Reassembly not
 * Drop the fragment as early as possible, if
 * we don't have resource(s) to re-assemble.
/* Record the ECN field info. */
 * If this is not the first fragment, dump the unfragmentable
 * Fragmentation reassembly.
Each ILL has a hash table for * queueing packets undergoing reassembly for all IPIFs * associated with the ILL. The hash is based on the packet * IP ident field. The ILL frag hash table was allocated * as a timer block at the time the ILL was created. Whenever * there is anything on the reassembly queue, the timer will /* Handle vnic loopback of fragments */ * If the reassembly list for this ILL will get too big /* Try to find an existing fragment queue for this packet. */ * It has to match on ident, source address, and * If we have received too many * duplicate fragments for this packet * If we pruned the list, do we want to store this new * fragment?. We apply an optimization here based on the * fact that most fragments will be received in order. * So if the offset of this incoming fragment is zero, * it is the first fragment of a new packet. We will * keep it. Otherwise drop the fragment, as we have * probably pruned the packet already (since the * packet cannot be found). /* New guy. Allocate a frag message. */ * Too many fragmented packets in this hash bucket. /* Initialize the fragment header. */ /* Record reassembly start time. */ /* Record ipf generation and account for frag header */ /* Store checksum value in fragment header */ * We handle reassembly two ways. In the easy case, * where all the fragments show up in order, we do * minimal bookkeeping, and just clip new pieces on * the end. If we ever see a hole, then we go off * to ip_reassemble which has to mark the pieces and * keep track of the number of holes, etc. Obviously, * the point of having both mechanisms is so we can * handle the easy case as efficiently as possible. /* Easy case, in-order reassembly so far. */ /* Update the byte count */ * Keep track of next expected offset in /* Hard case, hole at the beginning. */ * ipf_end == 0 means that we have given up /* Forget checksum offload from now on */ * ipf_hole_cnt is set by ip_reassemble. * ipf_count is updated by ip_reassemble. 
* No need to check for return value here * as we don't expect reassembly to complete or * fail for the first fragment itself. /* Update per ipfb and ill byte counts */ /* If the frag timer wasn't already going, start it. */ * If the packet's flag has changed (it could be coming up * from an interface different than the previous, therefore * possibly different checksum capability), then forget about * any stored checksum states. Otherwise add the value to * the existing one stored in the fragment header. /* Forget checksum offload from now on */ * We have a new piece of a datagram which is already being * reassembled. Update the ECN info if all IP fragments * are ECN capable. If there is one which is not, clear * all the info. If there is at least one which has CE * code point, IP needs to report that up to transport. /* The new fragment fits at the end */ /* Update the byte count */ /* Update per ipfb and ill byte counts */ /* Save current byte count */ /* Count of bytes added and subtracted (freeb()ed) */ /* Update per ipfb and ill byte counts */ /* Reassembly failed. Free up all resources */ /* We will reach here iff 'ret' is IP_REASS_COMPLETE */ * We have completed reassembly. Unhook the frag header from * Grab the unfragmentable header length next header value out * Before we free the frag header, record the ECN info * to report back to the transport. * Store the next-header field in the header preceding the fragment /* We need to supply these to caller */ /* Ditch the frag header. */ * Make sure the packet is good by doing some sanity * checks. If bad we can silently drop the packet. ip1dbg((
"ip_input_fragment_v6: bad packet\n"));
* Remove the fragment header from the initial header by * splitting the mblk into the non-fragmentable header and * everything after the fragment extension header. This has the * side effect of putting all the headers that need destination * processing into the b_cont block; on return this fact is * used in order to avoid having to look at the extensions * Note that this code assumes that the unfragmentable portion * of the header is in the first mblk and increments * the read pointer past it. If this assumption is broken ip1dbg((
"ip_input_fragment_v6: dupb failed\n"));
/* Restore original IP length in header. */ /* Record the ECN info. */ /* Update the receive attributes */ /* Reassembly is successful; set checksum information in packet */ * Given an mblk and a ptr, find the destination address in an IPv6 routing * Corrupt packet. Either the routing header length is odd * (can't happen) or mismatched compared to the packet, or the * number of addresses is. Return what we can. This will * only be a problem on forwarded packets that get squeezed * through an outbound tunnel enforcing IPsec Tunnel Mode. * Walk through the options to see if there is a routing header. * If present get the destination which is the last address of * mp needs to be provided in cases when the extension headers might span * b_cont; mp is never modified by this function. /* We assume at least the IPv6 base header is within one mblk. */ * We also assume (thanks to ipsec_tun_outbound()'s pullup) that * no extension headers will be split across mblks. * All IPv6 extension headers have the next-header in byte * 0, and the (length - 8) in 8-byte-words. /* Bad packet. Return what we can. */ * This function is called by redirect code (called from ip_input_v6) to * know whether this packet is source routed through this node i.e * whether this node (router) is part of the journey. This * function is called under two cases : * case 1 : Routing header was processed by this node and * ip_process_rthdr replaced ip6_dst with the next hop * and we are forwarding the packet to the next hop. * case 2 : Routing header was not processed by this node and we * are just forwarding the packet. * For case (1) we don't want to send redirects. For case(2) we * want to send redirects. ip2dbg((
"ip_source_routed_v6\n"));
/* if a routing hdr is preceded by HOPOPT or DSTOPT */ * Check if we have already processed * packets or we are just a forwarding * router which only pulled up msgs up * to IPV6HDR and one HBH ext header ip2dbg((
"ip_source_routed_v6: Extension" " headers not processed\n"));
* If for some reason, we haven't pulled up * the routing hdr data mblk, then we must * not have processed it at all. So for sure * we are not part of the source routed journey. ip2dbg((
"ip_source_routed_v6: Routing" " header not processed\n"));
* Either we are an intermediate router or the * last hop before destination and we have * already processed the routing header. * If segment_left is greater than or equal to zero, * then we must be the (numaddr - segleft) entry * of the routing header. Although ip6r0_segleft * is a uint8_t variable, we still check for zero * or greater value, in case the data type * is changed in the future. ip1dbg((
"ip_source_routed_v6: Not local\n"));
ip2dbg((
"ip_source_routed_v6: Not source routed here\n"));
* IPv6 fragmentation. Essentially the same as IPv4 fragmentation. * We have not optimized this in terms of number of mblks * allocated. For instance, for each fragment sent we always allocate an * mblk to hold the IPv6 header and fragment header. * Assumes that all the extension headers are contained in the first mblk * and that the fragment header has already been added by calling * Caller should have added fraghdr_t to pkt_len, and also * Determine the length of the unfragmentable portion of this * datagram. This consists of the IPv6 header, a potential * hop-by-hop options header, a potential pre-routing-header * destination options header, and a potential routing header. * Allocate an mblk with enough room for the link-layer * header and the unfragmentable part of the datagram, which includes * the fragment header. This (or a copy) will be used as the * first mblk for each fragment we send. * pkt_len is set to the total length of the fragmentable data in this * datagram. For each fragment sent, we will decrement pkt_len * by the amount of fragmentable data sent in that fragment * until len reaches zero. * Move read ptr past unfragmentable portion, we don't want this part * of the data in our fragments. ip1dbg((
"ip_fragment_v6: copyb failed\n"));
* Note: Optimization alert. * In IPv6 (and IPv4) protocol header, Fragment Offset * ("offset") is 13 bits wide and in 8-octet units. * In IPv6 protocol header (unlike IPv4) in a 16 bit field, * it occupies the most significant 13 bits. * (least significant 13 bits in IPv4). * We do not do any shifts here. Not shifting is same effect * as taking offset value in octet units, dividing by 8 and * then shifting 3 bits left to line it up in place in proper /* mp has already been freed by ip_carve_mp() */ ip1dbg((
"ip_carve_mp: failed\n"));
/* Get the priority marking, if any */ /* No point in sending the other fragments */ /* No need to redo state machine in loop */ * Add a fragment header to an IPv6 packet. * Assumes that all the extension headers are contained in the first mblk. * The fragment header is inserted after a hop-by-hop options header * and after [an optional destination options header followed by] a routing header. * Determine the length of the unfragmentable portion of this * datagram. This consists of the IPv6 header, a potential * hop-by-hop options header, a potential pre-routing-header * destination options header, and a potential routing header. * Allocate an mblk with enough room for the link-layer * header, the unfragmentable part of the datagram, and the /* Get the priority marking, if any */ * Move read ptr past unfragmentable portion, we don't want this part * of the data in our fragments. * Determine if the ill and multicast aspects of that packets * conn_incoming_ifindex is set by IPV6_BOUND_IF and as link-local * scopeid. This is used to limit * unicast and multicast reception to conn_incoming_ifindex. * conn_wantpacket_v6 is called both for unicast and /* mpathd can bind to the under IPMP interface, which we allow */ * pr_addr_dbg function provides the buffer space needed for * inet_ntop() function's 3rd argument. This function should be * used by any kernel routine which wants to save INET6_ADDRSTRLEN * stack buffer space in its own stack frame. This function uses * a buffer from its own stack and prints the information. * Example: pr_addr_dbg("func: no route for %s\n ", AF_INET, addr) * Note: This function can call inet_ntop() once. ip0dbg((
"pr_addr_dbg: Wrong arguments\n"));
* This does not compare debug level and just prints * out. Thus it is the responsibility of the caller * to check the appropriate debug-level before calling * Return the length in bytes of the IPv6 headers (base header * extension headers) that will be needed based on the * ip_pkt_t structure passed by the caller. * The returned length does not include the length of the upper level * If there's a security label here, then we ignore any hop-by-hop * options the user may try to set. * Note that ipp_label_len_v6 is just the option - not * the hopopts extension header. It also needs to be padded * to a multiple of 8 bytes. * En-route destination options * Only do them if there's a routing header as well * All-purpose routine to build a header chain of an IPv6 header * followed by any required extension headers and a proto header. * The caller has to set the source and destination address as well as * ip6_plen. The caller has to massage any routing header and compensate * for the ULP pseudo-header checksum due to the source route. * The extension headers will all be fully filled in. /* Initialize IPv6 header */ /* Overrides the class part of flowinfo */ * Here's where we have to start stringing together * any extension headers in the right order: * Hop-by-hop, destination, routing, and final destination opts. * If there's a security label here, then we ignore any hop-by-hop * options the user may try to set. * Hop-by-hop options with the label. * Note that ipp_label_v6 is just the option - not * the hopopts extension header. It also needs to be padded * to a multiple of 8 bytes. * En-route destination options * Only do them if there's a routing header as well * Do ultimate destination options * Now set the last header pointer to the proto passed in * Return a pointer to the routing header extension header * in the IPv6 header(s) chain passed in. 
* If none found, return NULL * Assumes that all extension headers are in the same mblk as the v6 header * The routing header will precede all extension headers * other than the hop-by-hop and destination options * extension headers, so if we see anything other than those, * we're done and didn't find it. * We could see a destination options header alone but no * routing header, in which case we'll return NULL as soon as * we see anything after that. * Hop-by-hop and destination option headers are identical, * so we can use either one we want as a template. /* Is there enough left for len + nexthdr? */ /* Assumes the headers are identical for hbh and dst */ * Called for source-routed packets originating on this node. * Manipulates the original routing header by moving every entry up * one slot, placing the first entry in the v6 header's v6_dst field, * and placing the ultimate destination in the routing header's last * Returns the checksum difference between the ultimate destination * (last hop in the routing header when the packet is sent) and * the first hop (ip6_dst when the packet is sent) * Perform any processing needed for source routing. * We know that all extension headers will be in the same mblk * If no segments left in header, or the header length field is zero, * don't move hop addresses around; * Checksum difference is zero. * Here's where the fun begins - we have to * move all addresses up one spot, take the * first hop and make it our first ip6_dst, * and place the ultimate destination in the * newly-opened last slot. * From the checksummed ultimate destination subtract the checksummed * current ip6_dst (the first hop address). Return that number. * (In the v4 case, the second part of this is done in each routine * that calls ip_massage_options(). We do it all in this one place * The following two functions set and get the value for the * IPV6_SRC_PREFERENCES socket option. 
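The address shuffle described for source-routed packets originating on this node (first hop becomes ip6_dst, the remaining hops move up one slot, the ultimate destination lands in the last slot) can be sketched on plain arrays. `massage_route` and `struct v6addr` are hypothetical names, and the checksum difference that the real routine returns is left out of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct v6addr {
	uint8_t b[16];
};

/*
 * On entry ip6_dst holds the ultimate destination and hops[] the
 * intermediate hops. On return ip6_dst is the first hop and hops[]
 * ends with the ultimate destination.
 */
static void
massage_route(struct v6addr *ip6_dst, struct v6addr *hops, size_t numaddr,
    uint8_t segleft)
{
	assert(ip6_dst != NULL && hops != NULL);

	/* Nothing to move if no segments are left or the header is empty. */
	if (segleft == 0 || numaddr == 0)
		return;

	struct v6addr ultimate = *ip6_dst;
	*ip6_dst = hops[0];
	/* Move every remaining entry up one slot. */
	memmove(&hops[0], &hops[1], (numaddr - 1) * sizeof (hops[0]));
	hops[numaddr - 1] = ultimate;
}
```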
* We only support preferences that are covered by * Look for conflicting preferences or default preferences. If * both bits of a related pair are clear, the application wants the * system's default value for that pair. Both bits in a pair can't * Get the size of the IP options (including the IP header's size) * without including the AH header's size. If till_ah is B_FALSE, * and an AH header is present, dest options beyond the AH header will * also be included in the returned size. /* Assume IP has already stripped it */ * If we don't have an AH header to traverse, * return now. This happens normally for * outbound datagrams where we have not inserted * We don't include the AH header's size * to be symmetrical with other cases where * we either don't have an AH header (outbound) * or peek into the AH header yet (inbound and * The destination options header * is not part of the first mblk. * Utility routine that checks if `v6srcp' is a valid address on underlying * interface `ill'. If `ipifp' is non-NULL, it's set to a held ipif * associated with `v6srcp' on success. NOTE: if this is not called from * inside the IPSQ (ill_g_lock is not held), `ill' may be removed from the * group during or after this lookup. pr_addr_dbg(
"ipif_lookup_testaddr_v6: cannot find ipif for "