dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * CDDL HEADER START
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * The contents of this file are subject to the terms of the
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Common Development and Distribution License (the "License").
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * You may not use this file except in compliance with the License.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * or http://www.opensolaris.org/os/licensing.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * See the License for the specific language governing permissions
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * and limitations under the License.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * When distributing Covered Code, include this CDDL HEADER in each
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * file and include the License file at usr/src/OPENSOLARIS.LICENSE.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * If applicable, add the following below this CDDL HEADER, with the
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * fields enclosed by brackets "[]" replaced with your own identifying
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * information: Portions Copyright [yyyy] [name of copyright owner]
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * CDDL HEADER END
47b75f87aad8081805fc42779dae59a3ee1de59dKacheong Poon * Copyright 2010 Sun Microsystems, Inc. All rights reserved.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Use is subject to license terms.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * NAT source entry garbage collection timeout. The actual timeout value
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * includes a random jitter bounded by the ILB_NAT_SRC_TIMEOUT_JITTER.
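/*
 * Illustration only: a minimal sketch of how a base timeout might be combined
 * with a bounded jitter, as the comment above describes. The function name,
 * its parameters and the half-open jitter range [0, jitter) are assumptions;
 * the real constants and the random source are not shown in this excerpt.
 */

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the base timeout plus a jitter in [0, jitter), derived from a
 * caller supplied random number (hypothetical helper, not part of ILB).
 */
static uint32_t
ex_jittered_timeout(uint32_t base, uint32_t jitter, uint32_t rnd)
{
	assert(jitter > 0);
	return (base + rnd % jitter);
}
```

/*
 * Passing the random number in keeps the helper deterministic and easy to
 * test; a caller would feed it from whatever random source is available.
 */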
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra/* key1/2 are assumed to be uint32_t. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra#define ILB_NAT_SRC_HASH(hash, key1, key2, hash_size) \
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra CRC32((hash), (key1), sizeof (uint32_t), -1U, crc32_table); \
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra CRC32((hash), (key2), sizeof (uint32_t), (hash), crc32_table); \
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra/* NAT source port space instance number. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra if (++i != 0) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra if (++i != 0) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra if (++i != 0) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * When ILB does full NAT, it first picks one source address from the rule's
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * specified NAT source address list (currently done in round robin fashion).
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Then it needs to allocate a port. This source port must make the tuple
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * (source address:source port:destination address:destination port)
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * unique. The destination part of the tuple is determined by the back
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * end server, and cannot be changed.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * To handle the above source port number allocation, ILB sets up a table
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * of entries identified by source address:back end server address:server port
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * tuple. This table is used by all rules for NAT source port allocation.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Each tuple has an associated vmem arena used for managing the NAT source
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * port space between the source address and back end server address/port.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Each back end server (ilb_server_t) has an array of pointers (iser_nat_src)
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * to the different entries in this table for NAT source port allocation.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * When ILB needs to allocate a NAT source address and port to talk to a back
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * end server, it picks a source address and uses the array pointer to get
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * to an entry. Then it calls vmem_alloc() on the associated vmem arena to
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * find an unused port.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * When a back end server is added, ILB sets up the aforementioned array.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * For each source address specified in the rule, ILB checks if there is any
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * existing entry which matches this source address:back end server address:
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * port tuple. The server port is either a specific port or 0 (meaning wild
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * card port). Normally, a back end server uses the same port as in the rule.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * If a back end server is used to serve two different rules, there will be
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * two different ports. Source port allocation for these two rules does not
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * conflict, hence we can use two vmem arenas (two different entries in the
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * table). But if a server uses a port range in one rule, we will treat it as
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * a wild card port. A wild card port matches any port. If this server
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * is used to serve more than one rule and those rules use the same set of
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * NAT source addresses, this means that they must share the same set of vmem
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * arenas (source port spaces). We do this for simplicity. If not,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * we need to partition the port range so that we can identify different forms
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * of source port number collision.
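/*
 * Illustration only: a minimal user-space sketch of the table entry described
 * above, keyed by (source address, server address, server port) with a
 * private source port space. Assumptions, none of them taken from ILB: IPv4
 * addresses held in uint32_t, a tiny port range, a boolean array standing in
 * for the vmem arena, and every ex_* name. The matching rule treats port 0 on
 * either side as a wild card, following the text above.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define	EX_PORT_START	4096U	/* hypothetical first NAT source port */
#define	EX_PORT_COUNT	64U	/* hypothetical size of the port space */

typedef struct ex_entry {
	uint32_t	src_addr;	/* NAT source address */
	uint32_t	serv_addr;	/* back end server address */
	uint16_t	serv_port;	/* server port, 0 means wild card */
	bool		used[EX_PORT_COUNT];	/* stand-in for the arena */
} ex_entry_t;

static void
ex_entry_init(ex_entry_t *e, uint32_t src, uint32_t serv, uint16_t port)
{
	memset(e, 0, sizeof (*e));
	e->src_addr = src;
	e->serv_addr = serv;
	e->serv_port = port;
}

/* Does this entry cover the given tuple? A wild card port matches any port. */
static bool
ex_entry_match(const ex_entry_t *e, uint32_t src, uint32_t serv, uint16_t port)
{
	return (e->src_addr == src && e->serv_addr == serv &&
	    (e->serv_port == port || e->serv_port == 0 || port == 0));
}

/* Return an unused source port from the entry, or 0 if none is left. */
static uint16_t
ex_port_alloc(ex_entry_t *e)
{
	for (unsigned i = 0; i < EX_PORT_COUNT; i++) {
		if (!e->used[i]) {
			e->used[i] = true;
			return ((uint16_t)(EX_PORT_START + i));
		}
	}
	return (0);
}

/* Give a previously allocated port back to the entry. */
static void
ex_port_free(ex_entry_t *e, uint16_t port)
{
	assert(port >= EX_PORT_START && port < EX_PORT_START + EX_PORT_COUNT);
	assert(e->used[port - EX_PORT_START]);
	e->used[port - EX_PORT_START] = false;
}
```

/*
 * Because each entry owns its own port space, two entries that differ in
 * server port can hand out the same source port without the four-tuple
 * colliding, while a wild card entry has to be shared.
 */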
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * NAT source address initialization routine.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra ilbs->ilbs_nat_src = kmem_zalloc(sizeof (ilb_nat_src_hash_t) *
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra for (i = 0; i < ilbs->ilbs_nat_src_hash_size; i++) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra mutex_init(&ilbs->ilbs_nat_src[i].nsh_lock, NULL,
47b75f87aad8081805fc42779dae59a3ee1de59dKacheong Poon ilbs->ilbs_nat_src_tid = timeout(ilb_nat_src_timer, ilbs,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * NAT source address clean up routine.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * By setting ilbs_nat_src_tid to 0, the timer handler will not
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * restart the timer.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra for (i = 0; i < ilbs->ilbs_nat_src_hash_size; i++) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra while ((cur = list_remove_head(&ilbs->ilbs_nat_src[i].nsh_head))
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra kmem_free(cur, sizeof (ilb_nat_src_entry_t));
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra mutex_destroy(&ilbs->ilbs_nat_src[i].nsh_lock);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra kmem_free(ilbs->ilbs_nat_src, sizeof (ilb_nat_src_hash_t) *
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra/* An arena name is "ilb_ns" + "_xxxxxxxxxx" */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Check if the NAT source and back end server pair ilb_nat_src_entry_t
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * exists. If it does, increment the refcnt and return it. If not, create
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * one and return it.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misrailb_find_nat_src(ilb_stack_t *ilbs, const in6_addr_t *nat_src,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra ILB_NAT_SRC_HASH(idx, &nat_src->s6_addr32[3], &serv_addr->s6_addr32[3],
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra mutex_enter(&ilbs->ilbs_nat_src[idx].nsh_lock);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra for (tmp = list_head(head); tmp != NULL; tmp = list_next(head, tmp)) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra if (IN6_ARE_ADDR_EQUAL(&tmp->nse_src_addr, nat_src) &&
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra IN6_ARE_ADDR_EQUAL(&tmp->nse_serv_addr, serv_addr) &&
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra /* Found one, return it. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra mutex_exit(&ilbs->ilbs_nat_src[idx].nsh_lock);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra tmp = kmem_alloc(sizeof (ilb_nat_src_entry_t), KM_NOSLEEP);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra mutex_exit(&ilbs->ilbs_nat_src[idx].nsh_lock);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra tmp->nse_nsh_lock = &ilbs->ilbs_nat_src[idx].nsh_lock;
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra (void) snprintf(arena_name, ARENA_NAMESZ, "ilb_ns_%u",
1a5e258f5471356ca102c7176637cdce45bac147Josef 'Jeff' Sipek atomic_inc_32_nv(&ilb_nat_src_instance));
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra if ((tmp->nse_port_arena = vmem_create(arena_name,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra (void *)NAT_PORT_START, NAT_PORT_SIZE, 1, NULL, NULL, NULL, 1,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra mutex_exit(&ilbs->ilbs_nat_src[idx].nsh_lock);
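/*
 * Illustration only: a minimal sketch of the find-or-create pattern with a
 * reference count that the routine above follows. Assumptions: a singly
 * linked bucket instead of list_t, calloc() instead of kmem_alloc(), IPv4
 * addresses in uint32_t, exact port comparison (wild card handling left out),
 * and all ex_* names. The bucket lock that the real code holds around the
 * lookup and insert is omitted here, so the caller would have to provide it.
 */

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct ex_nse {
	struct ex_nse	*next;
	uint32_t	src_addr;
	uint32_t	serv_addr;
	uint16_t	serv_port;
	uint32_t	refcnt;
} ex_nse_t;

/*
 * Return the entry for the tuple, taking a new reference on it. If no entry
 * exists, create one with a reference count of 1. Returns NULL when memory
 * cannot be allocated.
 */
static ex_nse_t *
ex_find_or_create(ex_nse_t **head, uint32_t src, uint32_t serv, uint16_t port)
{
	ex_nse_t *e;

	assert(head != NULL);
	for (e = *head; e != NULL; e = e->next) {
		if (e->src_addr == src && e->serv_addr == serv &&
		    e->serv_port == port) {
			/* Found one, take a reference and return it. */
			e->refcnt++;
			return (e);
		}
	}

	if ((e = calloc(1, sizeof (*e))) == NULL)
		return (NULL);
	e->src_addr = src;
	e->serv_addr = serv;
	e->serv_port = port;
	e->refcnt = 1;
	e->next = *head;
	*head = e;
	return (e);
}
```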
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Create an ilb_nat_src_t struct for an ilb_server_t struct.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misrailb_create_nat_src(ilb_stack_t *ilbs, ilb_nat_src_t **nat_src,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra const in6_addr_t *srv_addr, in_port_t port, const in6_addr_t *start,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra if ((src = kmem_zalloc(sizeof (ilb_nat_src_t), KM_NOSLEEP)) == NULL) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra for (i = 0; i < num && i < ILB_MAX_NAT_SRC; i++) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra src->src_list[i] = ilb_find_nat_src(ilbs, &cur_addr, srv_addr,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Increment num_src here so that we can call
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * ilb_destroy_nat_src() when we need to do cleanup.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Timer routine for garbage collecting unneeded NAT source entries. We
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * don't use a taskq for this since the table should be relatively small
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * and should be OK for a timer to handle.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra for (i = 0; i < ilbs->ilbs_nat_src_hash_size; i++) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra mutex_enter(&ilbs->ilbs_nat_src[i].nsh_lock);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * When a server is removed, it will release its
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * reference on an entry. But there may still be
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * connections using some ports. So check the size also.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra vmem_size(cur->nse_port_arena, VMEM_ALLOC) != 0) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra kmem_free(tmp, sizeof (ilb_nat_src_entry_t));
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra ilbs->ilbs_nat_src_tid = timeout(ilb_nat_src_timer, ilbs,
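/*
 * Illustration only: a minimal sketch of the garbage collection rule used by
 * the timer routine above, where an entry is freed only when nothing
 * references it and none of its ports are still allocated. Assumptions: a
 * singly linked list, a plain counter in place of vmem_size(), malloc() and
 * free() in place of the kernel allocator, no locking, and all ex_* names.
 */

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct ex_gc_entry {
	struct ex_gc_entry	*next;
	uint32_t		refcnt;		/* references from servers */
	uint32_t		ports_in_use;	/* ports held by connections */
} ex_gc_entry_t;

/* Prepend a new entry to the list (helper for trying the sweep out). */
static ex_gc_entry_t *
ex_gc_push(ex_gc_entry_t **head, uint32_t refcnt, uint32_t ports_in_use)
{
	ex_gc_entry_t *e = malloc(sizeof (*e));

	assert(head != NULL && e != NULL);
	e->refcnt = refcnt;
	e->ports_in_use = ports_in_use;
	e->next = *head;
	*head = e;
	return (e);
}

/* Free every entry that is unreferenced and idle. Returns how many went. */
static unsigned
ex_gc_sweep(ex_gc_entry_t **head)
{
	ex_gc_entry_t **pp = head;
	unsigned freed = 0;

	assert(head != NULL);
	while (*pp != NULL) {
		ex_gc_entry_t *cur = *pp;

		if (cur->refcnt != 0 || cur->ports_in_use != 0) {
			/* Still needed, keep it and move on. */
			pp = &cur->next;
			continue;
		}
		*pp = cur->next;
		free(cur);
		freed++;
	}
	return (freed);
}
```

/*
 * Checking the port count as well as the reference count covers the case
 * where a server has been removed but its connections have not drained yet.
 */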
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Destroy a given ilb_nat_src_t struct. It will also release the references
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * held on all its ilb_nat_src_entry_t.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Set each entry to be condemned and the garbage collector will
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * clean them up.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra for (i = 0; i < size; i++) {
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Given a backend server address and its ilb_nat_src_t, allocate a source
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * address and port for NAT usage.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misrailb_alloc_nat_addr(ilb_nat_src_t *src, in6_addr_t *addr, in_port_t *port,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra /* Increment of cur does not need to be atomic. It is just a hint. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra src->src_list[i]->nse_port_arena, 1, VM_NOSLEEP);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * If an index is given and we cannot allocate a port using
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * that entry, return NULL.
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra * Use the pre-calculated checksum to adjust the checksum of a packet after
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra adj_sum = (adj_sum & 0xffff) + (adj_sum >> 16);
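/*
 * Illustration only: a minimal sketch of adjusting a one's complement
 * checksum with a pre-calculated sum, built around the folding step shown
 * above. It follows the incremental update form HC' = ~(~HC + ~m + m') from
 * RFC 1624. The ex_* names and the helper that builds the adjustment for one
 * changed 16-bit word are assumptions, not the code elided from this excerpt.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Full one's complement checksum over n 16-bit words, for comparison. */
static uint16_t
ex_cksum(const uint16_t *w, size_t n)
{
	uint32_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += w[i];
	while ((sum >> 16) != 0)
		sum = (sum & 0xffff) + (sum >> 16);
	return ((uint16_t)~sum);
}

/* Adjustment for replacing one 16-bit word: ~old + new. */
static uint32_t
ex_adj_for_word(uint16_t old_word, uint16_t new_word)
{
	return ((uint32_t)(uint16_t)~old_word + new_word);
}

/* Apply a pre-calculated adjustment to an existing checksum in place. */
static void
ex_adj_cksum(uint16_t *cksum, uint32_t adj_sum)
{
	assert(cksum != NULL);
	adj_sum += (uint16_t)~(*cksum);
	/* Fold the carries back into the low 16 bits. */
	while ((adj_sum >> 16) != 0)
		adj_sum = (adj_sum & 0xffff) + (adj_sum >> 16);
	*cksum = (uint16_t)~adj_sum;
}
```

/*
 * Since NAT rewrites the same fields on every packet of a connection, the
 * adjustment can be computed once and reused, which avoids summing the whole
 * header or payload again for each packet.
 */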
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra/* Do full NAT (replace both source and destination info) on a packet. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misrailb_full_nat(int l3, void *iph, int l4, void *tph, ilb_nat_info_t *info,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra uint32_t adj_ip_sum, uint32_t adj_tp_sum, boolean_t c2s)
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra orig_sport = &((udpha_t *)tph)->uha_src_port;
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra orig_dport = &((udpha_t *)tph)->uha_dst_port;
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra IN6_V4MAPPED_TO_IPADDR(&info->vip, ipha->ipha_src);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra IN6_V4MAPPED_TO_IPADDR(&info->src, ipha->ipha_dst);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra adj_cksum(&ipha->ipha_hdr_checksum, adj_ip_sum);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra /* No checksum for IPv6 header */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra/* Do half NAT (only replace the destination info) on a packet. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misrailb_half_nat(int l3, void *iph, int l4, void *tph, ilb_nat_info_t *info,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra uint32_t adj_ip_sum, uint32_t adj_tp_sum, boolean_t c2s)
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra IN6_V4MAPPED_TO_IPADDR(&info->vip, ipha->ipha_src);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra adj_cksum(&ipha->ipha_hdr_checksum, adj_ip_sum);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra /* No checksum for IPv6 header */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra/* Calculate the IPv6 pseudo checksum, used for ICMPv6 NAT. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misrailb_pseudo_sum_v6(ip6_t *ip6h, uint8_t nxt_hdr)
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra sum = cur[0] + cur[1] + cur[2] + cur[3] + cur[4] + cur[5] + cur[6] +
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra cur[7] + cur[8] + cur[9] + cur[10] + cur[11] + cur[12] + cur[13] +
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra/* Do NAT on an ICMPv4 packet. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misrailb_nat_icmpv4(mblk_t *mp, ipha_t *out_iph, icmph_t *icmph, ipha_t *in_iph,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra in_port_t *sport, in_port_t *dport, ilb_nat_info_t *info, uint32_t sum,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra IN6_V4MAPPED_TO_IPADDR(&info->nat_src, out_iph->ipha_src);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra IN6_V4MAPPED_TO_IPADDR(&info->nat_src, in_iph->ipha_dst);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra IN6_V4MAPPED_TO_IPADDR(&info->nat_dst, out_iph->ipha_dst);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra IN6_V4MAPPED_TO_IPADDR(&info->nat_dst, in_iph->ipha_src);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra icmph->icmph_checksum = IP_CSUM(mp, IPH_HDR_LENGTH(out_iph), 0);
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra/* Do NAT on an ICMPv6 packet. */
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misrailb_nat_icmpv6(mblk_t *mp, ip6_t *out_ip6h, icmp6_t *icmp6h, ip6_t *in_ip6h,
dbed73cbda2229fd1aa6dc5743993cae7f0a7ee9Sangeeta Misra in_port_t *sport, in_port_t *dport, ilb_nat_info_t *info,