/* idm.c revision 4142b486074471a886d6d0f0aa625a38b03d4eba */
/*
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file.  If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 */

/*
 * Copyright 2009 Sun Microsystems, Inc.  All rights reserved.
 * Use is subject to license terms.
 */

/*
 * Potential tuneable for the maximum number of tasks.  Default to
 */

/*
 * Global list of transport handles
 *   These are listed in preferential order, so we can simply take the
 *   first "it_conn_is_capable" hit.  Note also that the order maps to
 *   the order of the idm_transport_type_t list.
 */
	/* iSER on InfiniBand transport handle */
	NULL,			/* transport ops */
	NULL},			/* transport caps */

	/* IDM native sockets transport handle */
	NULL,			/* transport ops */
	NULL}			/* transport caps */

/*
 * idm_transport_register()
 *
 * Provides a mechanism for an IDM transport driver to register its
 * transport ops and caps with the IDM kernel module.  Invoked during
 * a transport driver's attach routine.
 */

	/* All known non-native transports here; for now, iSER */

/*
 * idm_ini_conn_create
 *
 * This function is invoked by the iSCSI layer to create a connection context.
 * This does not actually establish the socket connection.
 *
 * cr - Connection request parameters
 * new_con - Output parameter that contains the new request if successful
 */

	/* create the transport-specific connection components */

		/* cleanup the failed connection */

		/*
		 * It is possible for an IB client to connect to
		 * an ethernet-only client via an IB-eth gateway.
		 * Therefore, if we are attempting to use iSER and
		 * fail, retry with sockets before ultimately
		 * failing the connection.
		 */

/*
 * idm_ini_conn_destroy
 *
 * Releases any resources associated with the connection.  This is the
 * complement to idm_ini_conn_create.
 *
 * ic - idm_conn_t structure representing the relevant connection
 */

	/*
	 * It's reasonable for the initiator to call idm_ini_conn_destroy
	 * from within the context of the CN_CONNECT_DESTROY notification.
	 * That's a problem since we want to destroy the taskq for the
	 * state machine associated with the connection.  Remove the
	 * connection from the list right away then handle the remaining
	 * work via the idm_global_taskq.
	 */
		    "idm_ini_conn_destroy: Couldn't dispatch task");
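The "preferential order" scan the comment describes can be illustrated with a minimal userland sketch (not the kernel code): walk the transport list in order and take the first registered transport whose "is capable" hook accepts the connection. The type names and the `t_conn_is_capable` signature here are illustrative assumptions.

```c
#include <stddef.h>

typedef enum { XPORT_ISER, XPORT_SOCKETS, XPORT_NUM } xport_type_t;

typedef struct xport {
	xport_type_t t_type;
	/* NULL hook means the transport never registered (e.g. no IB HCA) */
	int (*t_conn_is_capable)(int remote_supports_rdma);
} xport_t;

static int iser_capable(int rdma) { return (rdma); }	/* needs RDMA */
static int sock_capable(int rdma) { (void) rdma; return (1); } /* always ok */

/* Listed in preferential order, mirroring the enum order */
static xport_t xport_list[XPORT_NUM] = {
	{ XPORT_ISER,		iser_capable },
	{ XPORT_SOCKETS,	sock_capable },
};

/* Return the first registered transport that claims the connection */
xport_t *
xport_select(int remote_supports_rdma)
{
	for (int i = 0; i < XPORT_NUM; i++) {
		xport_t *x = &xport_list[i];
		if (x->t_conn_is_capable != NULL &&
		    x->t_conn_is_capable(remote_supports_rdma))
			return (x);
	}
	return (NULL);
}
```

Because the array order maps to the enum order, "prefer iSER, fall back to sockets" needs no extra priority field.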
/*
 * idm_ini_conn_connect
 *
 * Establish connection to the remote system identified in idm_conn_t.
 * The connection parameters including the remote IP address were established
 * in the call to idm_ini_conn_create.  The IDM state machine will
 * perform client notifications as necessary to prompt the initiator through
 * the login process.  IDM also keeps a timer running so that if the login
 * process doesn't complete in a timely manner it will fail.
 *
 * ic - idm_conn_t structure representing the relevant connection
 *
 * Returns success if the connection was established, otherwise some kind
 * of meaningful error code.
 *
 * Upon return the login has either failed or is logging in (ffp)
 */
	/* Hold connection until we return */

	/* Wait for login flag */

	/* ic->ic_conn_sm_status will contain the failure status */

/*
 * idm_ini_conn_disconnect
 *
 * Forces a connection (previously established using idm_ini_conn_connect)
 * to perform a controlled shutdown, cleaning up any outstanding requests.
 *
 * ic - idm_conn_t structure representing the relevant connection
 *
 * This is asynchronous and will return before the connection is properly
 * shut down.
 */

/*
 * idm_ini_conn_disconnect_wait
 *
 * Forces a connection (previously established using idm_ini_conn_connect)
 * to perform a controlled shutdown.  Blocks until the connection is
 * shut down.
 *
 * ic - idm_conn_t structure representing the relevant connection
 */

/*
 * idm_tgt_svc_create
 *
 * The target calls this service to obtain a service context for each available
 * transport, starting a service of each type related to the IP address and port
 * passed.  The idm_svc_req_t contains the service parameters.
 */

	/* Initialize transport-agnostic components of the service handle */

	/*
	 * Make sure all available transports are setup.  We call this now
	 * instead of at initialization time in case IB has become available
	 * since we started (hotplug, etc).
	 */

	/*
	 * Loop through the transports, configuring the transport-specific
	 * components of each one.
	 */

		/*
		 * If it_ops is NULL then the transport is unconfigured
		 * and we shouldn't try to start the service.
		 */
			/* Teardown any configured services */

	/* Free the svc context and return */

/*
 * idm_tgt_svc_destroy
 *
 * is - idm_svc_t returned by the call to idm_tgt_svc_create
 *
 * Cleanup any resources associated with the idm_svc_t.
 */

	/* remove this service from the global list */

	/* wakeup any waiters for service change */

	/* teardown each transport-specific service */

	/* tear down the svc resources */

	/* free the svc handle */

/*
 * idm_tgt_svc_online
 *
 * is - idm_svc_t returned by the call to idm_tgt_svc_create
 *
 * Online each transport service, as we want this target to be accessible
 * via any configured transport.
 *
 * When the initiator establishes a new connection to the target, IDM will
 * call the "new connect" callback defined in the idm_svc_req_t structure
 * and it will pass an idm_conn_t structure representing that new connection.
 */

		/* Walk through each of the transports and online them */

				/* transport is not registered */

				/*
				 * The last transport failed to online.
				 * Offline any transport onlined above and
				 * do not online the target.
				 */

					/* transport is not registered */

		/* Target service now online */

	/* Target service already online, just bump the count */

/*
 * idm_tgt_svc_offline
 *
 * is - idm_svc_t returned by the call to idm_tgt_svc_create
 *
 * Shutdown any online target services.
 */

		/* Walk through each of the transports and offline them */

			/* transport is not registered */

/*
 * idm_tgt_svc_lookup
 *
 * Lookup a service instance listening on the specified port
 */

			/*
			 * A service exists on this port, but it
			 * is going away, wait for it to cleanup.
			 */

/*
 * idm_negotiate_key_values()
 *
 * Give IDM level a chance to negotiate any login parameters it should own.
 *  -- leave unhandled parameters alone on request_nvl
 *  -- move all handled parameters to response_nvl with an appropriate response
 *  -- also add an entry to negotiated_nvl for any accepted parameters
 */

/*
 * idm_notice_key_values()
 *
 * Activate at the IDM level any parameters that have been negotiated.
 * Passes the set of key value pairs to the transport for activation.
 * This will be invoked as the connection is entering full-feature mode.
 */
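The online path above has a classic all-or-nothing shape: online each transport in order, and if transport k fails, offline transports 0..k-1 and fail the whole operation. A minimal userland sketch of that rollback loop, with illustrative names and a fixed two-transport table:

```c
#include <stddef.h>

#define	NXPORT	2

typedef struct {
	int	(*svc_online)(int idx);
	void	(*svc_offline)(int idx);
} svc_ops_t;

/* Test stubs recording what happened (illustrative) */
static int onlined[NXPORT], offlined[NXPORT];
static int ok_online(int i)    { onlined[i] = 1; return (0); }
static int bad_online(int i)   { (void) i; return (-1); }
static void rec_offline(int i) { offlined[i] = 1; }

int
svc_online_all(svc_ops_t *ops)
{
	int k;

	for (k = 0; k < NXPORT; k++) {
		if (ops[k].svc_online == NULL)
			continue;		/* transport is not registered */
		if (ops[k].svc_online(k) != 0)
			break;			/* the k'th transport failed */
	}
	if (k == NXPORT)
		return (0);			/* all transports online */

	/* Offline any transport onlined above and fail the operation */
	while (--k >= 0)
		if (ops[k].svc_offline != NULL)
			ops[k].svc_offline(k);
	return (-1);
}
```

Unregistered transports are skipped on the way up, so they are also never offlined on the way back down.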
/*
 * idm_buf_tx_to_ini
 *
 * This is IDM's implementation of the 'Put_Data' operational primitive.
 *
 * This function is invoked by a target iSCSI layer to request its local
 * Datamover layer to transmit the Data-In PDU to the peer iSCSI layer
 * on the remote iSCSI node.  The I/O buffer represented by 'idb' is
 * transferred to the initiator associated with task 'idt'.  The connection
 * info, contents of the Data-In PDU header, the DataDescriptorIn, BHS,
 * and the callback (idb->idb_buf_cb) at transfer completion are provided
 * as input.
 *
 * This data transfer takes place transparently to the remote iSCSI layer,
 * i.e. without its participation.
 *
 * Using sockets, IDM implements the data transfer by segmenting the data
 * buffer into appropriately sized iSCSI PDUs and transmitting them to the
 * initiator.  iSER performs the transfer using RDMA write.
 */

		/*
		 * Buffer should not contain the pattern.  If the pattern is
		 * present then we've been asked to transmit initialized data
		 */

		/*
		 * Bind buffer but don't start a transfer since the task
		 * is suspending
		 */

		/*
		 * Once the task is aborted, any buffers added to the
		 * idt_inbufv will never get cleaned up, so just return
		 * SUCCESS.  The buffer should get cleaned up by the
		 * client or framework once task_aborted has completed.
		 */

/*
 * idm_buf_rx_from_ini
 *
 * This is IDM's implementation of the 'Get_Data' operational primitive.
 *
 * This function is invoked by a target iSCSI layer to request its local
 * Datamover layer to retrieve certain data identified by the R2T PDU from the
 * peer iSCSI layer on the remote node.  The retrieved Data-Out PDU will be
 * mapped to the respective buffer by the task tags (ITT & TTT).
 */
/*
 * The connection information, contents of an R2T PDU, DataDescriptor, BHS, and
 * the callback (idb->idb_buf_cb) notification for data transfer completion are
 * provided as input.
 *
 * When an iSCSI node sends an R2T PDU to its local Datamover layer, the local
 * and remote Datamover layers transparently bring about the data transfer
 * requested by the R2T PDU, without the participation of the iSCSI layers.
 *
 * Using sockets, IDM transmits an R2T PDU for each buffer and the rx_data_out()
 * path assembles the Data-Out PDUs into the buffer.  iSER uses RDMA read.
 */

	/*
	 * "In" buf list is for "Data In" PDU's, "Out" buf list is for
	 * "Data Out" PDU's
	 */

		/*
		 * Bind buffer but don't start a transfer since the task
		 * is suspending
		 */

/*
 * idm_buf_tx_to_ini_done
 *
 * The transport calls this after it has completed a transfer requested by
 * a call to transport_buf_tx_to_ini
 *
 * Caller holds idt->idt_mutex, idt->idt_mutex is released before returning.
 * idt may be freed after the call to idb->idb_buf_cb.
 */

	/*
	 * idm_refcnt_rele may cause TASK_SUSPENDING --> TASK_SUSPENDED or
	 * TASK_ABORTING --> TASK_ABORTED transition if the refcount goes
	 * to 0
	 */

		/*
		 * To keep things simple we will ignore the case where the
		 * transfer was successful and leave all buffers bound to the
		 * task.  This allows us to also ignore the case where we've
		 * been asked to abort a task but the last transfer of the
		 * task has completed.  IDM has no idea whether this was, in
		 * fact, the last transfer of the task so it would be difficult
		 * to handle this case.  Everything should get sorted out again
		 * after task reassignment is complete.
		 *
		 * In the case of TASK_ABORTING we could conceivably call the
		 * buffer callback here but the timing of when the client's
		 * client_task_aborted callback is invoked vs. when the client's
		 * buffer callback gets invoked gets sticky.  We don't want
		 * the client to hear from us again after the call to
		 * client_task_aborted() but we don't want to give it a bunch
		 * of failed buffer transfers until we've called
		 * client_task_aborted().  Instead we'll just leave all the
		 * buffers bound and allow the client to cleanup.
		 */
/*
 * idm_buf_rx_from_ini_done
 *
 * The transport calls this after it has completed a transfer requested by
 * a call to transport_buf_rx_from_ini
 *
 * Caller holds idt->idt_mutex, idt->idt_mutex is released before returning.
 * idt may be freed after the call to idb->idb_buf_cb.
 */

	/*
	 * idm_refcnt_rele may cause TASK_SUSPENDING --> TASK_SUSPENDED or
	 * TASK_ABORTING --> TASK_ABORTED transition if the refcount goes
	 * to 0
	 */

		/*
		 * Buffer should not contain the pattern.  If it does then
		 * we did not get the data from the remote host.
		 */

		/*
		 * To keep things simple we will ignore the case where the
		 * transfer was successful and leave all buffers bound to the
		 * task.  This allows us to also ignore the case where we've
		 * been asked to abort a task but the last transfer of the
		 * task has completed.  IDM has no idea whether this was, in
		 * fact, the last transfer of the task so it would be difficult
		 * to handle this case.  Everything should get sorted out again
		 * after task reassignment is complete.
		 *
		 * In the case of TASK_ABORTING we could conceivably call the
		 * buffer callback here but the timing of when the client's
		 * client_task_aborted callback is invoked vs. when the client's
		 * buffer callback gets invoked gets sticky.  We don't want
		 * the client to hear from us again after the call to
		 * client_task_aborted() but we don't want to give it a bunch
		 * of failed buffer transfers until we've called
		 * client_task_aborted().  Instead we'll just leave all the
		 * buffers bound and allow the client to cleanup.
		 */

/*
 * idm_buf_alloc
 *
 * Allocates a buffer handle and registers it for use with the transport
 * layer.
 */
/*
 * If a buffer is not passed on bufptr, the buffer will be allocated
 * as well as the handle.
 *
 * ic - connection on which the buffer will be transferred
 * bufptr - allocate memory for buffer if NULL, else assign to buffer
 * buflen - length of buffer
 *
 * Returns idm_buf_t handle if successful, otherwise NULL
 */

	/* Don't allocate new buffers if we are not in FFP */

	/*
	 * If bufptr is NULL, we have an implicit request to allocate
	 * memory for this IDM buffer handle and register it for use
	 * with the transport.  To simplify this, and to give more freedom
	 * to the transport layer for its own buffer management, both of
	 * these actions will take place in the transport layer.
	 *
	 * If bufptr is set, then the caller has allocated memory (or more
	 * likely it's been passed from an upper layer), and we need only
	 * register the buffer for use with the transport layer.
	 */

		/*
		 * Allocate a buffer from the transport layer (which
		 * will also register the buffer for use).
		 */

		/* Set the bufalloc'd flag */

		/*
		 * For large transfers, set the passed bufptr into
		 * the buf handle, and register the handle with the
		 * transport layer.  As memory registration with the
		 * transport layer is a time/cpu intensive operation,
		 * for small transfers (up to a pre-defined bcopy
		 * threshold), use pre-registered memory buffers
		 * and bcopy data at the appropriate time.
		 */

	/*
	 * The transport layer is now expected to set the idb_bufalloc
	 * flag correctly to indicate if resources have been allocated.
	 */

/*
 * idm_buf_free
 *
 * Release a buffer handle along with the associated buffer that was allocated
 * or assigned with idm_buf_alloc
 */

/*
 * idm_buf_bind_in
 *
 * This function associates a buffer with a task.  This is only for use by the
 * iSCSI initiator that will have only one buffer per transfer direction
 */

		/*
		 * For small transfers, the iSER transport delegates the IDM
		 * layer to bcopy the SCSI Write data for faster IOPS.
		 */
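The small-vs-large policy described above (stage small transfers through a pre-registered buffer and bcopy; register the caller's memory directly only for large transfers) can be sketched in userland. The threshold value, struct layout, and names here are illustrative assumptions; `malloc` stands in for pre-registered memory.

```c
#include <stdlib.h>
#include <string.h>

#define	BCOPY_THRESHOLD	8192	/* illustrative cutoff, not the driver's */

typedef struct {
	void	*b_addr;
	size_t	b_len;
	int	b_bufalloc;	/* 1 if we allocated (and must free) memory */
	int	b_bcopy;	/* 1 if data is staged through our buffer */
} buf_t;

buf_t *
buf_alloc(void *bufptr, size_t buflen)
{
	buf_t *b = calloc(1, sizeof (*b));
	if (b == NULL)
		return (NULL);
	b->b_len = buflen;
	if (bufptr == NULL || buflen <= BCOPY_THRESHOLD) {
		/* allocate (pre-registered, in the real driver) memory */
		b->b_addr = malloc(buflen);
		b->b_bufalloc = 1;
		b->b_bcopy = (bufptr != NULL);
		if (bufptr != NULL)	/* stage small write data now */
			memcpy(b->b_addr, bufptr, buflen);
	} else {
		/* large transfer: register the caller's memory directly */
		b->b_addr = bufptr;
	}
	return (b);
}

void
buf_free(buf_t *b)
{
	if (b->b_bufalloc)
		free(b->b_addr);
	free(b);
}
```

The `b_bufalloc` flag plays the role the comments assign to `idb_bufalloc`: it tells the free path whether the memory belongs to the layer or to the caller.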
		/*
		 * For small transfers, the iSER transport delegates the IDM
		 * layer to bcopy the SCSI Read data into the read buffer.
		 */

/*
 * idm_buf_find() will lookup the idm_buf_t based on the relative offset in the
 * task.
 */

	/* iterate through the list to find the buffer */

	/*
	 * Don't check the pattern in buffers that came from outside IDM
	 * (these will be buffers from the initiator that we opted not
	 * to copy)
	 */

	/* Return true if we find the pattern anywhere in the buffer */

		    "bufpat_idb=%p bufmagic=%08x offset=%08x",
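The buffer-pattern debug checks referenced throughout (transmit path: "buffer should not contain the pattern"; receive path: pattern still present means no data arrived) amount to filling a buffer with a recognizable stamp and later scanning for any intact copy. A userland sketch, where the 16-byte stamp layout and the magic constant are illustrative, not the driver's:

```c
#include <stdint.h>
#include <string.h>

typedef struct bufpat {
	uint64_t	bufpat_idb;	/* buffer handle address */
	uint32_t	bufpat_bufmagic; /* constant magic value */
	uint32_t	bufpat_offset;	/* offset of this stamp in the buffer */
} bufpat_t;

#define	BUFPAT_MAGIC	0x49444d42u	/* illustrative magic */

/* Stamp the pattern across the whole buffer */
void
bufpat_fill(void *idb, uint8_t *buf, size_t len)
{
	for (size_t off = 0; off + sizeof (bufpat_t) <= len;
	    off += sizeof (bufpat_t)) {
		bufpat_t p = { (uint64_t)(uintptr_t)idb, BUFPAT_MAGIC,
		    (uint32_t)off };
		memcpy(buf + off, &p, sizeof (p));
	}
}

/* Return 1 if any intact pattern remains anywhere in the buffer */
int
bufpat_check(void *idb, const uint8_t *buf, size_t len)
{
	for (size_t off = 0; off + sizeof (bufpat_t) <= len;
	    off += sizeof (bufpat_t)) {
		bufpat_t p;
		memcpy(&p, buf + off, sizeof (p));
		if (p.bufpat_idb == (uint64_t)(uintptr_t)idb &&
		    p.bufpat_bufmagic == BUFPAT_MAGIC &&
		    p.bufpat_offset == off)
			return (1);
	}
	return (0);
}
```

Embedding the per-stamp offset means a partially overwritten buffer still reports exactly which region was never touched, which is what makes the diagnostic message with `offset=%08x` possible.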
/*
 * idm_task_alloc
 *
 * This function will allocate an idm_task_t structure.  A task tag is also
 * generated and saved in idt_tt.  The task is not active.
 */

	/* Don't allocate new tasks if we are not in FFP */

/*
 * idm_task_start
 *
 * Mark the task active and initialize some stats.  The caller
 * sets up the idm_task_t structure with a prior call to idm_task_alloc().
 * The task service does not function as a task/work engine, it is the
 * responsibility of the initiator to start the data transfer and free the
 * resources when the task is complete.
 */

	/* mark the task as ACTIVE */

/*
 * idm_task_done
 *
 * This function sets the state to indicate that the task is no longer active.
 */

	/*
	 * Although unlikely it is possible for a reference to come in after
	 * the client has decided the task is over but before we've marked
	 * the task idle.  One specific unavoidable scenario is the case where
	 * a received PDU with the matching ITT/TTT results in a successful
	 * lookup of this task.  We are at the mercy of the remote node in
	 * that case so we need to handle it.  Now that the task state
	 * has changed no more references will occur so a simple call to
	 * idm_refcnt_wait_ref should deal with the situation.
	 */

/*
 * idm_task_free
 *
 * This function will free the Task Tag and the memory allocated for the task
 * idm_task_done should be called prior to this call
 */

	/*
	 * It's possible for items to still be in the idt_inbufv list if
	 * they were added after idm_task_cleanup was called.  We rely on
	 * STMF to free all buffers associated with the task however STMF
	 * doesn't know that we have this reference to the buffers.
	 * Use list_create so that we don't end up with stale references
	 * to these buffers.
	 */

/*
 * common code for idm_task_find() and idm_task_find_and_complete()
 */

	/*
	 * Must match both itt and ttt.  The table is indexed by itt
	 * for initiator connections and ttt for target connections.
	 */

				/*
				 * Task doesn't match or task is aborting and
				 * we don't want any more references.
				 */
					    "idm_task_find: wrong connection %p != %p",
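The lookup rule above (index by one tag, but a hit must match both ITT and TTT, and an aborting task refuses new references) can be sketched in userland. The table shape, state enum, and names are illustrative assumptions; the real driver also takes the connection and refcount machinery into account.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { TASK_IDLE, TASK_ACTIVE, TASK_ABORTING } task_state_t;

typedef struct task {
	uint32_t	t_itt;		/* initiator task tag */
	uint32_t	t_ttt;		/* target transfer tag */
	task_state_t	t_state;
	int		t_refcnt;
} task_t;

#define	TASK_TABLE_SZ	8
static task_t *task_table[TASK_TABLE_SZ];

/* Target side: index by TTT, then verify both tags and the task state */
task_t *
task_find(uint32_t itt, uint32_t ttt)
{
	task_t *t = task_table[ttt % TASK_TABLE_SZ];

	if (t == NULL || t->t_itt != itt || t->t_ttt != ttt)
		return (NULL);	/* task doesn't match */
	if (t->t_state != TASK_ACTIVE)
		return (NULL);	/* aborting/idle: no more references */
	t->t_refcnt++;		/* hold the task for the caller */
	return (t);
}
```

Checking both tags guards against a stale or malicious PDU whose TTT happens to index a live slot, which is exactly the "at the mercy of the remote node" scenario the comments describe.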
	/*
	 * Set the task state to TASK_COMPLETE so it can no longer
	 * be found or aborted.
	 */

/*
 * idm_task_find
 *
 * This function looks up a task by task tag.
 */

/*
 * idm_task_find_and_complete
 *
 * This function looks up a task by task tag.  If found, the task state
 * is atomically set to TASK_COMPLETE so it can no longer be found or aborted.
 */

/*
 * idm_task_find_by_handle
 *
 * This function looks up a task by the client-private idt_client_handle.
 *
 * This function should NEVER be called in the performance path.
 */

			/*
			 * Task is either in suspend, abort, or already
			 * complete.
			 */

	/*
	 * Passing NULL as the task indicates that all tasks
	 * for this connection should be aborted.
	 */

		/*
		 * Only the connection state machine should ask for
		 * all tasks to abort and this should never happen in FFP.
		 */

/*
 * Caller must hold the task mutex, which will be released before return
 */

	/* Caller must hold connection mutex */

		/* Call transport to release any resources */

			/*
			 * Wait for outstanding references.  When all
			 * references are released the callback will call
			 * idm_task_aborted().
			 */

			/*
			 * Wait for outstanding references.  When all
			 * references are released the callback will call
			 * idm_task_aborted().
			 */

		/* Already called transport_free_task_rsrc(); */

		/* Already called transport_free_task_rsrc(); */

			/*
			 * We could probably call idm_task_aborted directly
			 * here but we may be holding the conn lock.  It's
			 * easier to just switch contexts.  Even though
			 * we shouldn't really have any references we'll
			 * set the state to TASK_ABORTING instead of
			 * TASK_ABORTED so we can use the same code path.
			 */

		/* We're already past this point... */

		/*
		 * In this case, let it go.  The status has already been
		 * sent (which may or may not get successfully transmitted)
		 * and we don't want to end up in a race between completing
		 * the status PDU and marking the task suspended.
		 */

	/*
	 * Remove all the buffers from the task and add them to a
	 * temporary local list -- we do this so that we can hold
	 * the task lock and prevent the task from going away if
	 * the client frees it.
	 */

	/* This could happen during abort in iscsit. */

/*
 * idm_pdu_tx
 *
 * This is IDM's implementation of the 'Send_Control' operational primitive.
 */
/*
 * This function is invoked by an initiator iSCSI layer requesting the transfer
 * of an iSCSI command PDU or a target iSCSI layer requesting the transfer of an
 * iSCSI response PDU.  The PDU will be transmitted as-is by the local Datamover
 * layer to the peer iSCSI layer in the remote iSCSI node.  The connection info
 * and iSCSI PDU-specific qualifiers namely BHS, AHS, DataDescriptor and Size
 * are provided as input.
 */

	/*
	 * If we are in full-featured mode then route SCSI-related
	 * commands to the appropriate function vector without checking
	 * the connection state.  We will only be in full-feature mode
	 * when we are in an acceptable state for SCSI PDU's.
	 *
	 * We also need to ensure that there are no PDU events outstanding
	 * on the state machine.  Any non-SCSI PDU's received in full-feature
	 * mode will result in PDU events and until these have been handled
	 * we need to route all PDU's through the state machine as PDU
	 * events to maintain ordering.
	 *
	 * Note that IDM cannot enter FFP mode until it processes in
	 * its state machine the last xmit of the login process.
	 * Hence, checking the IDM_PDU_LOGIN_TX flag here would be
	 * redundant.
	 */

	/*
	 * Any PDU's processed outside of full-feature mode and non-SCSI
	 * PDU's in full-feature mode are handled by generating an
	 * event to the connection state machine.  The state machine
	 * will validate the PDU against the current state and either
	 * transmit the PDU if the opcode is allowed or handle an
	 * error if the PDU is not allowed.
	 *
	 * This code-path will also generate any events that are implied
	 * by the PDU opcode.  For example a "logout response" with success
	 * status generates a CE_LOGOUT_SUCCESS_SND event.
	 */

	/*
	 * Connection state machine will validate these PDU's against
	 * the current state.  A PDU not allowed in the current
	 * state will cause a protocol error.
	 */

/*
 * idm_pdu_alloc
 *
 * Common allocation of a PDU along with memory for header and data.
 * IDM clients should cache these structures for performance
 * critical paths.  We can't cache effectively in IDM because we
 * don't know the correct header and data size.
 */
/*
 * Valid header length is assumed to be hdrlen and valid data
 * length is assumed to be datalen.  isp_hdrlen and isp_datalen
 * can be adjusted after the PDU is returned if necessary.
 */

	/* For idm_pdu_free sanity check */

/*
 * Typical idm_pdu_alloc invocation, will block for resources.
 */

/*
 * Non-blocking idm_pdu_alloc implementation, returns NULL if resources
 * are not available.  Needed for transport-layer allocations which may
 * be invoked in interrupt context.
 */

/*
 * Free a PDU previously allocated with idm_pdu_alloc() including any
 * header and data space allocated as part of the original request.
 * Additional memory regions referenced by subsequent modification of
 * the isp_hdr and/or isp_data fields will not be freed.
 */

	/* Make sure the structure was allocated using idm_pdu_alloc() */

/*
 * Initialize the connection, private and callback fields in a PDU.
 */

	/*
	 * idm_pdu_complete() will call idm_pdu_free if the callback is
	 * NULL.  This will only work if the PDU was originally allocated
	 * with idm_pdu_alloc().
	 */

/*
 * Initialize the header and header length field.  This function should
 * not be used to adjust the header length in a buffer allocated via
 * idm_pdu_alloc since it overwrites the existing header pointer.
 */

/*
 * Initialize the data and data length fields.  This function should
 * not be used to adjust the data length of a buffer allocated via
 * idm_pdu_alloc since it overwrites the existing data pointer.
 */

/*
 * Object reference tracking
 */

	/*
	 * Grab the mutex so there are no other lingering threads holding
	 * the mutex before we destroy it (e.g. idm_refcnt_rele just after
	 * the refcnt goes to zero if ir_waiting == REF_WAIT_ASYNC)
	 */

	/*
	 * Nothing should take a hold on an object after a call to
	 * idm_refcnt_wait_ref or idm_refcnt_async_wait_ref
	 */

	/* No one is waiting on this object */

	/*
	 * Someone is waiting for this object to go idle so check if
	 * refcnt is 0.  Waiting on an object then later grabbing another
	 * reference is not allowed so we don't need to handle that case.
	 */
		    "idm_refcnt_rele: Couldn't dispatch task");
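The "common allocation of a PDU along with memory for header and data" can be sketched as a single allocation that carves the header and data regions out of the space just past the struct, with a magic value recorded so the free side can sanity-check its input. The struct shape, field names, and magic constant here are illustrative assumptions, not the driver's layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define	PDU_MAGIC	0x50445531u	/* illustrative sanity-check value */

typedef struct pdu {
	uint32_t	 isp_flags;	/* records PDU_MAGIC for pdu_free */
	size_t		 isp_hdrlen;
	size_t		 isp_datalen;
	uint8_t		*isp_hdr;	/* points just past the struct */
	uint8_t		*isp_data;	/* points just past the header */
} pdu_t;

pdu_t *
pdu_alloc(size_t hdrlen, size_t datalen)
{
	/* one allocation covers the struct, the header, and the data */
	pdu_t *p = malloc(sizeof (pdu_t) + hdrlen + datalen);
	if (p == NULL)
		return (NULL);	/* a nonblocking variant returns NULL here */
	p->isp_flags = PDU_MAGIC;
	p->isp_hdrlen = hdrlen;
	p->isp_datalen = datalen;
	p->isp_hdr = (uint8_t *)(p + 1);
	p->isp_data = p->isp_hdr + hdrlen;
	return (p);
}

void
pdu_free(pdu_t *p)
{
	/* make sure the structure was allocated with pdu_alloc() */
	assert(p->isp_flags == PDU_MAGIC);
	free(p);
}
```

Because header and data live inside the same allocation, one `free()` releases everything; any pointers later swapped into `isp_hdr`/`isp_data` are outside that allocation and, as the comments note, are not freed here.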
	/*
	 * Someone is waiting for this object to go idle so check if
	 * refcnt is 0.  Waiting on an object then later grabbing another
	 * reference is not allowed so we don't need to handle that case.
	 */
		    "idm_refcnt_rele: Couldn't dispatch task");
	/*
	 * It's possible we don't have any references.  To make things easier
	 * on the caller use a taskq to call the callback instead of
	 * calling it synchronously
	 */
		    "idm_refcnt_async_wait_ref: "
		    "Couldn't dispatch task");
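The refcount-with-waiter pattern these comments describe can be reduced to a userland sketch: a release that drops the count to zero while an async waiter is registered fires the wakeup, and registering a waiter when no references remain fires it immediately. Here the callback runs synchronously for simplicity; the driver instead dispatches it through a taskq so it never runs in the releaser's context. All names are illustrative.

```c
#include <stddef.h>

typedef enum { REF_NOWAIT, REF_WAIT_ASYNC } ref_wait_t;

typedef struct refcnt {
	int		  ir_refcnt;
	ref_wait_t	  ir_waiting;
	void		(*ir_cb)(struct refcnt *);
} refcnt_t;

/* Test instrumentation (illustrative) */
static int idle_fired;
static void note_idle(refcnt_t *r) { (void) r; idle_fired++; }

void
refcnt_hold(refcnt_t *r)
{
	r->ir_refcnt++;
}

void
refcnt_rele(refcnt_t *r)
{
	r->ir_refcnt--;
	/* fire the callback only when a waiter exists and we hit zero */
	if (r->ir_waiting == REF_WAIT_ASYNC && r->ir_refcnt == 0 &&
	    r->ir_cb != NULL)
		r->ir_cb(r);
}

void
refcnt_async_wait(refcnt_t *r, void (*cb)(refcnt_t *))
{
	r->ir_cb = cb;
	r->ir_waiting = REF_WAIT_ASYNC;
	/* it's possible we don't have any references already */
	if (r->ir_refcnt == 0)
		cb(r);
}
```

The rule that "waiting on an object then later grabbing another reference is not allowed" is what lets both paths skip re-checking the count after the callback fires.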
	/* Initialize the rwlock for the taskid table */

	/* Initialize the global mutex and taskq */

	/*
	 * The maximum allocation needs to be high here since there can be
	 * many concurrent tasks using the global taskq.
	 */

	/* Start watchdog thread */

		/* Couldn't create the watchdog thread */

	/* Pause until the watchdog thread is running */

	/*
	 * Allocate the task ID table and set "next" to 0.
	 */

	/* Create the global buffer and task kmem caches */

	/*
	 * Note, we're explicitly allocating an additional iSER header-
	 * sized chunk for each of these elements.  See idm_task_constructor().
	 */

	/* Create the service and connection context lists */

	/* Initialize the native sockets transport */

	/* Create connection ID pool */

	/* Close any LDI handles we have open on transport drivers */

	/* Teardown the native sockets transport */