/*
* This file and its contents are supplied under the terms of the
* Common Development and Distribution License ("CDDL"), version 1.0.
* You may only use this file in accordance with the terms of version
* 1.0 of the CDDL.
*
* A full copy of the text of the CDDL should have accompanied this
* source. A copy of the CDDL is also available via the Internet at
*/
/*
* Copyright 2016 Joyent, Inc.
*/
/*
* -------------------------
* Interrupt Handling Theory
* -------------------------
*
* There are three sets of interrupts that we need to worry about:
*
* - Interrupts from receive queues
* - Interrupts from transmit queues
* - 'Other Interrupts', such as the administrative queue
*
* 'Other Interrupts' are asynchronous events such as a link status change event
* being posted to the administrative queue, unrecoverable ECC errors, and more.
* If we have something being posted to the administrative queue, then we go
* through and process it, because it's generally enabled as a separate logical
* interrupt. Note, we may need to do more here eventually. To re-enable the
* interrupts from the 'Other Interrupts' section, we need to clear the PBA and
* write ENA to PFINT_ICR0.
*
* Interrupts from the transmit and receive queues indicate that our requests
* have been processed. In the rx case, it means that we have data that we
* should take a look at and send up the stack. In the tx case, it means that
* data which we got from MAC has now been sent out on the wire and we can free
* the associated data. Most of the logic for acting upon the presence of this
* data can be found in i40e_transceiver.c which handles all of the DMA, rx, and
* tx operations. This file is dedicated to handling and dealing with interrupt
* processing.
*
* All devices supported by this driver support three kinds of interrupts:
*
* o Extended Message Signaled Interrupts (MSI-X)
* o Message Signaled Interrupts (MSI)
* o Legacy PCI interrupts (INTx)
*
* Generally speaking the hardware logically handles MSI and INTx the same and
* restricts us to only using a single interrupt, which isn't the interesting
* case. With MSI-X available, each physical function of the device provides the
* opportunity for multiple interrupts which is what we'll focus on.
*
* --------------------
* Interrupt Management
* --------------------
*
* By default, the admin queue, which consists of the asynchronous other
* interrupts, is always bound to MSI-X vector zero. Next, we spread out all of
* the other interrupts that we have available to us over the remaining
* interrupt vectors.
*
* This means that there may be multiple queues, both tx and rx, which are
* mapped to the same interrupt. When the interrupt fires, we'll have to check
* all of them for servicing, before we go through and indicate that the
* interrupt is claimed.
*
* The hardware provides the means of mapping various queues to MSI-X interrupts
* by programming the I40E_QINT_RQCTL() and I40E_QINT_TQCTL() registers. These
* registers can also be used to enable and disable whether or not the queue is
* a source of interrupts. As part of this, the hardware requires that we
* maintain a linked list of queues for each interrupt vector. While it may seem
* like this is only there for the purposes of ITRs, that's not the case. The
* first queue must be programmed in I40E_QINT_LNKLSTN(%vector) register. Each
* queue defines the next one in either the I40E_QINT_RQCTL or I40E_QINT_TQCTL
* register.
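*
* As a rough illustration of the linked list described above, the sketch below
* models it purely in software. It does not touch hardware and does not use the
* real register layouts: every sk_-prefixed name is hypothetical, and the
* rx -> tx -> end-of-list ordering is an assumption made for the example. The
* only value taken from this file is the end-of-list marker, 0x7FF.
*
```c
#include <assert.h>
#include <stdint.h>

// Hypothetical software model; these are not the real register layouts.
#define	SK_QUEUE_EOL	0x7FF	// same value as I40E_QUEUE_TYPE_EOL
#define	SK_NQUEUES	4	// arbitrary size for the example

typedef enum { SK_QTYPE_RX, SK_QTYPE_TX } sk_qtype_t;

// One link: which queue comes next and whether it is an rx or tx queue.
typedef struct {
	uint16_t	skl_next_idx;
	sk_qtype_t	skl_next_type;
} sk_link_t;

typedef struct {
	sk_link_t	skv_head;		// stands in for QINT_LNKLSTN
	sk_link_t	skv_rqctl[SK_NQUEUES];	// next-queue part of QINT_RQCTL
	sk_link_t	skv_tqctl[SK_NQUEUES];	// next-queue part of QINT_TQCTL
} sk_vector_t;

// Chain: vector head -> rx queue q -> tx queue q -> end of list.
static void
sk_link_pair(sk_vector_t *v, uint16_t q)
{
	assert(q < SK_NQUEUES);
	v->skv_head.skl_next_idx = q;
	v->skv_head.skl_next_type = SK_QTYPE_RX;
	v->skv_rqctl[q].skl_next_idx = q;
	v->skv_rqctl[q].skl_next_type = SK_QTYPE_TX;
	v->skv_tqctl[q].skl_next_idx = SK_QUEUE_EOL;
	v->skv_tqctl[q].skl_next_type = SK_QTYPE_RX;	// ignored at EOL
}

// Walk the list and count the queues reachable from the vector head.
static unsigned
sk_count_queues(const sk_vector_t *v)
{
	unsigned n = 0;
	sk_link_t cur = v->skv_head;

	while (cur.skl_next_idx != SK_QUEUE_EOL) {
		assert(cur.skl_next_idx < SK_NQUEUES);
		n++;
		cur = (cur.skl_next_type == SK_QTYPE_RX) ?
		    v->skv_rqctl[cur.skl_next_idx] :
		    v->skv_tqctl[cur.skl_next_idx];
	}
	return (n);
}
```
*
* In this model, calling sk_link_pair() for queue 0 and then walking the list
* visits two queues, the rx queue followed by the tx queue. A head that holds
* SK_QUEUE_EOL describes an empty list.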
*
* Because we only have a single queue enabled at the moment and we always have
* two interrupts, we do something pretty simple and just know that there's one
* data queue in the interrupt handler. Longer term, we'll need to think harder
* about this, but for the moment it'll have to suffice.
*
* Finally, the individual interrupt vector itself has the ability to be enabled
* and disabled. The overall interrupt is controlled through the
* I40E_PFINT_DYN_CTLN() register. This is used to turn on and off the interrupt
* as a whole.
*
* Note that this means that both the individual queue and the interrupt as a
* whole can be toggled and re-enabled.
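*
* The two levels of control can be pictured with a small model. This is only a
* sketch: the structure and function names are hypothetical, and it assumes
* that a queue can raise an interrupt only when both its own cause enable and
* the vector-wide enable are set.
*
```c
#include <assert.h>
#include <stdbool.h>

#define	SK_GATE_NQUEUES	2	// arbitrary size for the example

// Hypothetical model of the two enable levels described above.
typedef struct {
	bool	skg_vector_ena;			// vector-wide enable
	bool	skg_queue_ena[SK_GATE_NQUEUES];	// per-queue cause enable
} sk_gate_t;

// A queue is an effective interrupt source only if both levels are on.
static bool
sk_queue_can_interrupt(const sk_gate_t *g, unsigned q)
{
	assert(q < SK_GATE_NQUEUES);
	return (g->skg_vector_ena && g->skg_queue_ena[q]);
}
```
*
* Clearing skg_vector_ena silences every queue on the vector at once, while
* clearing one skg_queue_ena entry silences only that queue.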
*
* -------------------
* Non-MSIX Management
* -------------------
*
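* We may have a case where the Operating System is unable to actually allocate
* any MSI-X interrupts. In such a world, there is only one transmit/receive
* queue pair and it is bound to the same interrupt with index zero. The
* hardware doesn't allow us access to additional interrupt vectors in these
* modes. Note that technically we could support more transmit/receive queues if
* we wanted.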
*
* In this world, because the interrupts for the admin queue and traffic are
* mixed together, we have to consult ICR0 to determine what has occurred. The
* QINT_TQCTL and QINT_RQCTL registers have a field, 'MSI-X 0 index' which
* allows us to set a specific bit in ICR0. There are up to seven such bits;
* however, we only use bits 0 and 1 for the rx and tx queue respectively.
* These are described by the I40E_INTR_NOTX_{R|T}X_QUEUE and
* I40E_INTR_NOTX_{R|T}X_MASK macros respectively.
*
* Unfortunately, these queue bits have no corresponding entry in
* the ICR0_ENA register. So instead, when enabling interrupts on the queues, we
* end up enabling it on the queue registers rather than on the MSI-X registers.
* In the MSI-X world, because they can be enabled and disabled, this is
* different and the queues can always be enabled and disabled, but the
* interrupts themselves are toggled (ignoring the question of interrupt
* blanking for polling on rings).
*
* Finally, we still have to set up the interrupt linked list, but the list is
* instead rooted at the register I40E_PFINT_LNKLST0, rather than being tied to
* one of the other MSI-X registers.
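*
* To make the shared-interrupt case concrete, here is a minimal sketch of
* decoding a single cause register into the work that needs doing. The bit
* positions and all sk_-prefixed names are hypothetical and chosen only for
* illustration; the real ICR0 layout is defined by the hardware headers. A
* value of zero means that none of our causes fired.
*
```c
#include <assert.h>
#include <stdint.h>

// Hypothetical bit positions; these are not the real ICR0 layout.
#define	SK_ICR0_RX	(1u << 0)
#define	SK_ICR0_TX	(1u << 1)
#define	SK_ICR0_ADMINQ	(1u << 2)

// Work items that the handler should carry out.
#define	SK_WORK_ADMINQ	0x1u
#define	SK_WORK_RX	0x2u
#define	SK_WORK_TX	0x4u

// Translate the cause bits into a work mask; zero means nothing to do.
static unsigned
sk_icr0_decode(uint32_t icr0)
{
	unsigned work = 0;

	assert((SK_ICR0_RX & SK_ICR0_TX) == 0);	// causes must not overlap
	if (icr0 & SK_ICR0_ADMINQ)
		work |= SK_WORK_ADMINQ;
	if (icr0 & SK_ICR0_RX)
		work |= SK_WORK_RX;
	if (icr0 & SK_ICR0_TX)
		work |= SK_WORK_TX;
	return (work);
}
```
*
* A handler built on this would leave the interrupt unclaimed when the result
* is zero, and otherwise service each indicated source before claiming it.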
*
* --------------------
* Interrupt Moderation
* --------------------
*
* The XL710 hardware has three different interrupt moderation registers per
* interrupt. Unsurprisingly, we use these for:
*
* o RX interrupts
* o TX interrupts
* o 'Other interrupts' (link status change, admin queue, etc.)
*
* By default, we throttle 'other interrupts' the most, then TX interrupts, and
* then RX interrupts. The default values for these were based on trying to
* reason about both the importance and frequency of events. Generally speaking
* 'other interrupts' are not very frequent and they're not important for the
* I/O data path in and of itself (though they may indicate issues with the I/O
* data path).
*
* On the flip side, when we're not polling, RX interrupts are very important.
* The longer we wait for them, the more latency that we inject into the system.
* However, if we allow interrupts to occur too frequently, we risk a few
* problems:
*
* 1) Abusing system resources. Without proper interrupt blanking and polling,
* we can see upwards of 200k-300k interrupts per second on the system.
*
* 2) Not enough data coalescing to enable polling. In other words, the more
* data that we allow to build up, the more likely we'll be able to enable
* polling mode, which allows us to better handle bulk data.
*
* In between the 'other interrupts' and the RX interrupts we have the TX
* interrupts, which drive the reclamation of TX buffers. This operation is not
* quite as important, because we generally size the ring large enough that we
* should be able to reclaim a substantial number of the descriptors that we
* have used per interrupt. So
* while it's important that this interrupt occur, we don't necessarily need it
* firing as frequently as RX; it doesn't, on its own, induce additional latency
* into the system.
*
* Based on all this we currently assign static ITR values for the system. While
* we could move to a dynamic system (the hardware supports that), we'd want to
* make sure that we're seeing problems from this that we believe would be
* generally helped by the added complexity.
*
* Based on this, the default values that we have allow for the following
* interrupt thresholds:
*
* o 20k interrupts/s for RX
* o 5k interrupts/s for TX
* o 2k interrupts/s for 'Other Interrupts'
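*
* The relationship between these rates and a throttle value can be sketched
* as follows. This assumes that an ITR value counts units of 2 microseconds;
* that granularity, and the sk_-prefixed names, are assumptions made for the
* example rather than something stated in this file.
*
```c
#include <assert.h>
#include <stdint.h>

// Assumed granularity: one ITR unit is taken to be 2 microseconds.
#define	SK_ITR_GRANULARITY_US	2u

// Convert a target interrupt rate into a throttle value in ITR units.
static uint32_t
sk_rate_to_itr(uint32_t intrs_per_sec)
{
	assert(intrs_per_sec != 0);

	// Minimum spacing between interrupts, in microseconds.
	uint32_t interval_us = 1000000u / intrs_per_sec;

	return (interval_us / SK_ITR_GRANULARITY_US);
}
```
*
* Under that assumption, 20k, 5k and 2k interrupts/s correspond to minimum
* spacings of 50, 200 and 500 microseconds, or 25, 100 and 250 ITR units.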
*/
#include "i40e_sw.h"
#define I40E_INTR_NOTX_QUEUE 0
#define I40E_INTR_NOTX_INTR 0
#define I40E_INTR_NOTX_RX_QUEUE 0
void
{
int i;
/*
* No matter the interrupt mode, the ITR for other interrupts is always
* on interrupt zero and the same is true if we're not using MSI-X.
*/
if (itr == I40E_ITR_INDEX_OTHER ||
return;
}
}
}
/*
* Re-enable the adminq. Note that the adminq doesn't have a traditional queue
* associated with it from an interrupt perspective and just lives on ICR0.
* However when MSI-X interrupts are not being used, then this also enables and
* disables those interrupts.
*/
static void
{
i40e_flush(hw);
}
static void
{
}
static void
{
}
static void
{
}
/*
* When MSI-X interrupts are being used, then we can enable the actual
* interrupts themselves. However, when they are not, we instead have to turn
* towards the queue's CAUSE_ENA bit and enable that.
*/
void
{
int i;
i40e_intr_io_enable(i40e, i);
}
} else {
}
}
/*
* When MSI-X interrupts are being used, then we can disable the actual
* interrupts themselves. However, when they are not, we instead have to turn
* towards the queue's CAUSE_ENA bit and disable that.
*/
void
{
int i;
i40e_intr_io_disable(i40e, i);
}
} else {
}
}
/*
* As part of disabling the tx and rx queues, we're technically supposed to
* remove the linked list entries. The simplest way is to clear the LNKLSTN
* register by setting it to I40E_QUEUE_TYPE_EOL (0x7FF).
*
* Note all of the FM register access checks are performed by the caller.
*/
void
{
int i;
return;
}
#ifdef DEBUG
/*
* Verify that the interrupt in question is disabled. This is a
* prerequisite of modifying the data in question.
*/
#endif
}
i40e_flush(hw);
}
/*
* Finalize interrupt handling. Mostly this disables the admin queue.
*/
void
{
#ifdef DEBUG
int i;
/*
* Take a look and verify that all other interrupts have been disabled
* and the interrupt linked lists have been zeroed.
*/
}
}
#endif
}
/*
* Enable all of the queues and set the corresponding LNKLSTN registers. Note
* that we always enable queues as interrupt sources, even though we don't
* enable the MSI-X interrupt vectors.
*/
static void
{
/*
* Because we only have a single queue, just do something simple now.
* How this all works will need to really be properly redone based on
* the bit maps, etc. Note that we skip the ITR logic for the moment,
* just to make our lives as explicit and simple as possible.
*/
reg = (0 << I40E_PFINT_LNKLSTN_FIRSTQ_INDX_SHIFT) |
(0 << I40E_QINT_RQCTL_NEXTQ_INDX_SHIFT) |
}
/*
* Set up a single queue to share the admin queue interrupt in the non-MSI-X
* world. Note we do not enable the queue as an interrupt cause at this time. We
* don't have any other vector of control here, unlike with the MSI-X interrupt
* case.
*/
static void
{
}
/*
* Enable the specified queue as a valid source of interrupts. Note, this should
* only be used as part of the GLDv3's interrupt blanking routines. The debug
* build assertions are specific to that.
*/
void
{
}
/*
* Disable the specified queue as a valid source of interrupts. Note, this
* should only be used as part of the GLDv3's interrupt blanking routines. The
* debug build assertions are specific to that.
*/
void
{
}
/*
* Start up the chip's interrupt handling. We not only configure the
* adminq here, but we also go through and configure all of the actual queues,
* the interrupt linked lists, and others.
*/
void
{
/*
* Ensure that all non adminq interrupts are disabled at the chip level.
*/
/*
* Always enable all of the other-class interrupts to be on their own
* ITR. This only needs to be set on interrupt zero, which has its own
* special setting.
*/
/*
* Enable interrupt types we expect to receive. At the moment, this
* is limited to the adminq; however, we'll want to review 11.2.2.9.22
* for more types here as we add support for detecting them, handling
* them, and resetting the device as appropriate.
*/
/*
* Always set the interrupt linked list to empty. We'll come back and
* change this if MSI-X interrupts are actually on the scene.
*/
/*
* Set up all of the queues and map them to interrupts based on the bit
* assignments.
*/
} else {
}
/*
* Finally set all of the default ITRs for the interrupts. Note that the
* queues will have been set up above.
*/
}
static void
{
while (remain != 0) {
/*
* At the moment, the only error code that seems to be returned
* is one saying that there's no work. In such a case we leave
* this be.
*/
if (ret != I40E_SUCCESS)
break;
switch (opcode) {
break;
default:
/*
* Longer term we'll want to enable other causes here
* and get these cleaned up and doing something.
*/
break;
}
}
}
static void
{
itrq->itrq_rxgen);
}
}
static void
{
}
/*
* At the moment, the only 'other' interrupt on ICR0 that we handle is the
* adminq. We should go through and support the other notifications at some
* point.
*/
static void
{
DDI_FM_OK) {
return;
}
if (reg & I40E_PFINT_ICR0_ADMINQ_MASK)
/*
* Make sure that the adminq interrupt is not masked and then explicitly
* enable the adminq and thus the other interrupt.
*/
}
{
/*
* When using MSI-X interrupts, vector 0 is always reserved for the
* adminq at this time. Though longer term, we'll want to also bridge
* some I/O to them.
*/
if (vector_idx == 0) {
return (DDI_INTR_CLAIMED);
}
/*
* Note that we explicitly do not check this value under the lock even
* though assignments to it are done so. In this case, the cost of
* getting this wrong is at worst a bit of additional contention and
* even more rarely, a duplicated packet. However, the cost on the other
* hand is a lot more. This is something that as we more generally
* implement ring support we should revisit.
*/
i40e_intr_rx_work(i40e, 0);
i40e_intr_tx_work(i40e, 0);
return (DDI_INTR_CLAIMED);
}
static uint_t
{
return (DDI_INTR_UNCLAIMED);
}
}
DDI_FM_OK) {
return (DDI_INTR_CLAIMED);
}
if (reg == 0) {
goto done;
}
if (reg & I40E_PFINT_ICR0_ADMINQ_MASK)
if (reg & I40E_INTR_NOTX_RX_MASK)
i40e_intr_rx_work(i40e, 0);
if (reg & I40E_INTR_NOTX_TX_MASK)
i40e_intr_tx_work(i40e, 0);
done:
return (ret);
}
/* ARGSUSED */
{
}
/* ARGSUSED */
{
}