drmach.c revision 03831d35f7499c87d51205817c93e9a8d42c4bae
* DRMACH_MEM_SLICE_SIZE and DRMACH_MEM_USABLE_SLICE_SIZE define the * available address space and the usable address space for every slice. * There must be a distinction between the available and usable due to a * restriction imposed by CDC memory size. * The Cheetah's Safari Configuration Register and the Schizo's * Safari Control/Status Register place the LPA base and bound fields in * the same bit locations within their register word. This source code takes * which are shared by various Cheetah and Schizo drmach routines. * Name properties for frequently accessed device nodes. * Maximum value of processor Safari Timeout Log (TOL) field of * Safari Config reg (7 secs). * drmach_board_t flag definitions * The following global is read as a boolean value, non-zero is true. * If zero, DR copy-rename and cpu poweron will not set the processor * LPA settings (CBASE, CBND of Safari config register) to correspond * to the current memory slice map. LPAs of processors present at boot * will remain as programmed by POST. LPAs of processors on boards added * by DR will remain NULL, as programmed by POST. This can be used to * override the per-board L1SSFLG_THIS_L1_NULL_PROC_LPA flag set by * POST in the LDCD (and copied to the GDCD by SMS). * drmach_reprogram_lpa and L1SSFLG_THIS_L1_NULL_PROC_LPA do not apply * to Schizo device LPAs. These are always set by DR. * There is a known HW bug where a Jaguar CPU in Safari port 0 (SBX/P0) * can fail to receive an XIR. To work around this issue until a hardware * fix is implemented, we will exclude the selection of these CPUs. * Setting this to 0 will allow their selection again. * Setting to non-zero will enable delay before all disconnect ops. * Default delay is slightly greater than the max processor Safari timeout. * This delay is intended to ensure the outstanding Safari activity has * retired on this board prior to a board disconnect.
* By default, DR of non-Panther procs is not allowed into a Panther * domain with large page sizes enabled. Setting this to 0 will remove * Used to pass updated LPA values to procs. * Protocol is to clear the array before use. int drmach_debug = 0;
/* set to non-zero to enable debug messages */ * drmach autoconfiguration data structures and interfaces * drmach_boards_rwlock is used to synchronize read/write * access to drmach_boards array between status and board lookup * as READERS, and assign, and unassign threads as WRITERS. * drmach_node_* routines serve the purpose of separating the * rest of the code from the device tree and OBP. This is necessary * because of In-Kernel-Probing. Devices probed after stod, are probed * by the in-kernel-prober, not OBP. These devices, therefore, do not static char *
fn = "drmach_node_obp_get_parent";
* dip doesn't have to be held here as we are called * from ddi_walk_devs() which holds the dip. * Set "here" to NULL so that unheld dip is not accessible * outside ddi_walk_devs() /* initialized args structure for callback */ * Root node doesn't have to be held in any way. /* initialized args structure for callback */ /* save our new position within the tree */ /* save our new position within the tree */ static char *
fn = "drmach_node_ddi_get_parent";
* Check if a CPU node is part of a CMP. * The branch rooted at dip will have been previously * held, or it will be the child of a CMP. In either * case, the hold acquired in e_ddi_nodeid_to_dip() static char *
fn = "drmach_node_ddi_get_prop";
static char *fn = "drmach_node_obp_get_prop";
rv = (*len < 0 ? -1 : 0);
* drmach_array provides convenient array construction, access, * bounds checking and array destruction logic. /* clear the array entry */ /* read the gdcd, bail if magic or ver #s are not what is expected */ * On Starcat, there is no CPU driver, so it is * not necessary to configure any CPU nodes. * We held this branch earlier, so at a minimum its * root should still be present in the device tree. DRMACH_PR(
"drmach_configure: configuring DDI branch");
* Record first failure but don't stop * If non-NULL, fdip is returned held and must be /* every node is expected to have a name */ "dip: 0x%p: property %s",
* Not a node of interest to dr - including "cmp", * but it is in drmach_name2type[], which lets gptwocfg * driver to check if node is OBP created. * Derive a best-guess unit number from the portid value. * Some drmach_*_new constructors (drmach_pci_new, for example) * will overwrite the prototype unum value with one that is more * appropriate for the device. * we need to know if the board's connected before * issuing a showboard message. If it's connected, we just * reply with status composed of cached info DRMACH_PR(
"Unknown test status=0x%x from SC\n",
int o_nretry; /* number of sending retries */
int f_error; /* mailbox framework error */
m_reply : 1,
/* msg reply received */ * Timeout values (in seconds) used when waiting for replies (from the SC) to * requests that we sent. Since we only receive boardevent messages, and they * are events rather than replies, there is no boardevent timeout. * Delay (in seconds) used after receiving a non-transient error indication from * an mboxsc_getmsg call in the thread that loops waiting for incoming messages. * Timeout values (in milliseconds) for mboxsc_putmsg and mboxsc_getmsg calls. * Normally, drmach_to_putmsg is set dynamically during initialization in * drmach_mbox_init. This has the potentially undesirable side effect of * clobbering any value that might have been set in /etc/system. To prevent * dynamic setting of drmach_to_putmsg (thereby allowing it to be tuned in * /etc/system), set drmach_use_tuned_putmsg_to to 1. /* maximum conceivable message size for future mailbox protocol versions */ for (i = 0; i <
18; ++i) {
for (i = 0; i < 18; ++i) {
for (i = 0; i < 18; ++i) {
DRMACH_PR(
"\tmemaddrhi=0x%x memaddrlo=0x%x ",
DRMACH_PR(
"\tmemaddrhi=0x%x memaddrlo=0x%x ",
DRMACH_PR(
"\trecovered=0x%x test status=0x%x\n",
DRMACH_PR(
"\tmemaddrhi=0x%x memaddrlo=0x%x ",
DRMACH_PR(
": active=%d t_status=%d t_level=%d ",
DRMACH_PR(
"dr hdr:\n\tid=0x%x vers=0x%x cmd=0x%x exp=0x%x slot=0x%x\n",
* Callback function passed to taskq_dispatch when a mailbox reinitialization * handshake needs to be scheduled. The handshake can't be performed by the * thread that determines it is needed, in most cases, so this function is * dispatched on the system-wide taskq pool of threads. Failure is reported but * otherwise ignored, since any situation that requires a mailbox initialization * handshake will continue to request the handshake until it succeeds. DRMACH_PR(
"scheduled mailbox reinit running\n");
/* need to initialize the mailbox */ "mbox_init: MBOX_INIT failed ecode=0x%x",
* To ensure sufficient compatibility with future versions of the DR mailbox * protocol, we use a buffer that is large enough to receive the largest message * that could possibly be sent to us. However, since that ends up being fairly * large, allocating it on the stack is a bad idea. Fortunately, this function * does not need to be MT-safe since it is only invoked by the mailbox * framework, which will never invoke it multiple times concurrently. Since * that is the case, we can use a static buffer. /* don't try to interpret anything with the wrong version number */ /* schedule a reinit handshake if one isn't pending */ "failed to schedule mailbox reinit");
"Unsolicited mboxsc_getmsg failed: err=0x%x code=0x%x",
/* check for initialization event */ /* schedule a reinit handshake if one isn't pending */ "failed to schedule mailbox reinit");
/* anything else will be a log_sysevent call */ DRMACH_PR(
"Slot Unavailable event received");
* unlink an entry from the message transaction list * caller must hold drmach_msglist_mutex /* get a reply message */ * If mboxsc_getmsg returns ETIMEDOUT or EAGAIN, then * the "error" is really just a normal, transient * condition and we can retry the operation right away. * Any other error suggests a more serious problem, * ranging from a message being too big for our buffer * (EMSGSIZE) to total failure of the mailbox layer. * This second class of errors is much less "transient", * so rather than retrying over and over (and getting * the same error over and over) as fast as we can, * we'll sleep for a while before retrying. "mboxsc_getmsg failed, err=0x%x",
err);
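The retry policy described in the comment above (retry transient mboxsc_getmsg errors right away, back off on anything else) can be sketched as follows. This is a minimal illustration and not the driver's code: `getmsg_retry_delay_secs` and its `error_delay_secs` parameter are hypothetical names, and the error constants are the standard ones from `<errno.h>`.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper: return how many seconds to wait before retrying
 * a failed receive.  ETIMEDOUT and EAGAIN are treated as transient, so
 * the caller may retry immediately; any other error (EMSGSIZE, mailbox
 * layer failure, ...) backs off for error_delay_secs.
 */
static unsigned int
getmsg_retry_delay_secs(int err, unsigned int error_delay_secs)
{
	assert(err != 0);	/* only meaningful for a failed call */

	if (err == ETIMEDOUT || err == EAGAIN)
		return (0);

	return (error_delay_secs);
}
```

A receive loop would sleep for the returned number of seconds (possibly zero) before calling the receive routine again.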
"mailbox version mismatch 0x%x vs 0x%x",
/* schedule a reinit handshake if one isn't pending */ "failed to schedule mailbox reinit");
* Search through the list to find entries awaiting /* setup transaction list entry */ /* send mailbox message, await reply */ "!msgid=0x%x reply timed out",
case 0:
/* signal received */ "operation interrupted by signal");
* If link is set for this entry, check to see if * the linked entry has been replied to. If not, * Currently, this is only used for ABORT_TEST functionality, * wherein a check is made for the TESTBOARD reply when * the ABORT_TEST reply is received. * If the reply to the linked entry hasn't been * received, clear the existing link->f_error, "!link msgid=0x%x reply timed out",
* If framework failure is due to signal, return "no error" /* need to initialize the mailbox */ "!reinitializing DR mailbox");
* If framework failure incoming is encountered on * the MBOX_INIT [timeout on SMS reply], the error * type must be changed before returning to caller. * This is to prevent drmach_board_connect() and * drmach_board_disconnect() from marking boards * UNUSABLE based on MBOX_INIT failures. "!Changed mbox incoming to outgoing" /* setup outgoing mailbox header */ "Unknown outgoing message type 0x%x",
msgtype);
* For DRMSG_TESTBOARD attempts which have timed out, or * been aborted due to a signal received after mboxsc_putmsg() * has succeeded in sending the message, a DRMSG_ABORT_TEST /* register the outgoing mailbox */ /* setup the mboxsc_putmsg timeout value */ DRMACH_PR(
"putmsg range is 0x%lx - 0x%lx value" /* register the incoming mailbox */ /* initialize mutex for mailbox globals */ /* initialize mutex for mailbox re-init */ /* initialize mailbox message list elements */ /* start mailbox sendmsg thread */ /* start mailbox getmsg thread */ "drmach_mbox_fini: waiting for mbox threads...");
"drmach_mbox_fini: mbox threads done.");
/* de-register the outgoing mailbox */ /* de-register the incoming mailbox */ case 0:
case 1:
case 2:
case 3:
/* cpu/wci devices */ case 0x1e:
/* slot 0 axq registers */ case 8:
case 9:
/* cpu devices */ case 0x1c:
case 0x1d:
/* schizo/wci devices */ case 0x1f:
/* slot 1 axq registers */ ASSERT(0);
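The case labels above group device numbers by the slot they live in. A minimal sketch of that classification follows. It assumes the switch selects on the low five bits of a port id and that the two groups map to slot 0 and slot 1 respectively; neither the selector expression nor the assigned values appear in this listing, and `portid_to_slot` is a hypothetical name.

```c
#include <assert.h>

/*
 * Hypothetical helper: classify a port id as belonging to slot 0 or
 * slot 1.  Returns -1 for a device number outside the listed cases
 * (the original asserts in debug kernels at that point).
 */
static int
portid_to_slot(unsigned int portid)
{
	switch (portid & 0x1f) {	/* assumption: low five bits */
	case 0: case 1: case 2: case 3:	/* cpu/wci devices */
	case 0x1e:			/* slot 0 axq registers */
		return (0);

	case 8: case 9:			/* cpu devices */
	case 0x1c: case 0x1d:		/* schizo/wci devices */
	case 0x1f:			/* slot 1 axq registers */
		return (1);

	default:
		return (-1);
	}
}
```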
/* catch in debug kernels */ * For Starcat, we must be children of the root devinfo node * Only children of the root devinfo node need to be * of tree operations. This corresponds to the node types * listed in the drmach_name2type array. /* Not of interest to us */ /* portid translated to an invalid board number */ " invalid property value, %s=%u",
DRMACH_PR(
"gdcd size=0x%x align=0x%x PA=0x%x\n",
DRMACH_PR(
"drmach size=0x%x PA=0x%lx VA=0x%p\n",
* Walk immediate children of devinfo root node and hold * all devinfo branches of interest. * To avoid a circular patch dependency between DR and AXQ, the AXQ * rev introducing the axq_iopause_*_all interfaces should not regress * when installed without the DR rev using those interfaces. The default * setting the following axq flag to zero, axq will not enable iopause * during suspend/resume, instead DR will call the axq_iopause_*_all * interfaces during drmach_copy_rename. * Walk immediate children of the root devinfo node * releasing holds acquired on branches in drmach_init() /* get register address, read madr value */ /* fetch mc's bank madr register value */ /* encode new base pa into madr */ /* memory is always in slot 0 */ /* look up slot 1 board on same expander */ bp =
id;
/* bp will be NULL if board not found */ /* look up should never be out of bounds */ /* nothing to do when board is not found or has no devices */ * Skip all non-Schizo IO devices (only IO nodes * that are Schizo devices have non-zero scsr_pa). * Filter out "other" leaf to avoid writing to the * For Panther MCs, append the MC idle reg address and drmach_mem_t pointer. * The latter is returned when drmach_rename fails to idle a Panther MC and * is used to identify the MC for error reporting. /* only slot 0 has memory */ /* verify supplied buffer space is adequate */ /* addr for all possible MC banks */ /* list section terminator */ /* addr/id tuple for local Panther MC idle reg */ /* list section terminator */ /* addr/id tuple for 2 boards with 4 Panther MC idle regs */ /* list section terminator */ /* addr/val tuple for 1 proc with 4 MC banks */ /* list section terminator */ /* addr/val tuple for 2 boards w/ 2 schizos each */ /* addr/val tuple for 2 boards w/ 16 MC banks each */ /* list section terminator */ /* addr/val tuple for 18 AXQs w/ two slots each */ /* list section terminator */ /* copy bank list to rename script */ /* list section terminator */ * Write idle script for MC on this processor. A script will be * produced only if this is a Panther processor on the source or /* list section terminator */ * Write idle script for all other MCs on source and target /* list section terminator */ * Step 1: Write source base address to target MC * Step 2: Now rewrite target reg with present bit on. /* exchange base pa. include slice offset in new target base pa */ DRMACH_PR(
"preparing MC MADR rename script (master is CPU%d):\n",
* Write rename script for MC on this processor. A script will * be produced only if this processor is on the source or target /* list section terminator */ * Write rename script for all other MCs on source and target /* list section terminator */ DRMACH_PR(
"preparing AXQ CASM rename script (EXP%d <> EXP%d):\n",
/* list section & final terminator */ DRMACH_PR(
"local Panther MC idle reg (via ASI 0x4a):\n");
DRMACH_PR("addr=0x%lx, mp=0x%lx\n", *q, *(q + 1));
DRMACH_PR(
"non-local Panther MC idle reg (via ASI 0x15):\n");
DRMACH_PR("addr=0x%lx, mp=0x%lx\n", *q, *(q + 1));
DRMACH_PR(
"MC reprogramming script (via ASI 0x72):\n");
uint64_t r = *q++; /* register address */
uint64_t v = *q++; /* new register value */
/* verify final terminator is present */ DRMACH_PR(
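The loop fragment above consumes the script two words at a time. A minimal sketch of walking one section of such a script follows. It assumes each section is a run of (register address, new value) pairs closed by a single zero word in the address position; the listing only calls this a "list section terminator", so the zero encoding is an assumption, and `script_section_len` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the (register address, new value) tuples in one script section.
 * Assumption: a section ends with one zero word where the next address
 * would be.  On return, *next points just past that terminator.
 */
static size_t
script_section_len(const uint64_t *q, const uint64_t **next)
{
	size_t n = 0;

	assert(q != NULL && next != NULL);

	while (*q != 0) {
		uint64_t r = *q++;	/* register address */
		uint64_t v = *q++;	/* new register value */

		(void) r;		/* a real walker would use r and v */
		(void) v;
		n++;
	}

	*next = q + 1;			/* skip the section terminator */
	return (n);
}
```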
"copy-rename script 0x%p, len %d\n",
/* get starting physical address of target memory */ /* calculate slice offset mask from slice size */ /* calculate source and target base pa */ /* adjust copy memlist addresses to be relative to copy base pa */ DRMACH_PR(
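The address arithmetic named in the comments above (slice offset mask, slice base, addresses relative to a base) can be sketched as below. This assumes the slice size is a power of two, which the mask approach requires; the helper names are hypothetical and the sizes used are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Mask selecting the offset of an address within its slice. */
static uint64_t
slice_offset_mask(uint64_t slice_size)
{
	/* assumption: slice_size is a non-zero power of two */
	assert(slice_size != 0 && (slice_size & (slice_size - 1)) == 0);
	return (slice_size - 1);
}

/* Round a physical address down to its slice boundary. */
static uint64_t
slice_base(uint64_t pa, uint64_t slice_size)
{
	return (pa & ~slice_offset_mask(slice_size));
}

/* Offset of a physical address from the start of its slice. */
static uint64_t
slice_offset(uint64_t pa, uint64_t slice_size)
{
	return (pa & slice_offset_mask(slice_size));
}
```

Making a memlist relative to a copy base is then a subtraction of that base from each span address.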
"source copy span: base pa 0x%lx, end pa 0x%lx\n",
DRMACH_PR(
"target copy span: base pa 0x%lx, end pa 0x%lx\n",
DRMACH_PR(
"copy memlist (relative to copy base pa):\n");
DRMACH_PR(
"current source base pa 0x%lx, size 0x%lx\n",
DRMACH_PR(
"current target base pa 0x%lx, size 0x%lx\n",
/* Map in appropriate cpu sram page */ /* Make sure the rename routine will fit */ /* copy text. standard bcopy not designed to work in nc space */ /* zero remainder. standard bzero not designed to work in nc space */ /* disable and flush CDC */ /* mark both memory units busy */ for (i = 0; i <
NCPU; i++) {
/* update casm shadow for target and source board */ * Make a good-faith effort to notify the SC about the copy-rename, but * will duplicate the update. DRMACH_PR(
"waited %d out of %d tries for drmach_rename_wait on %d cpus",
* Prevent slot1 IO from accessing Safari memory bus. for (i = 0; i <
NCPU; i++)
/* steal the line back, preserve data */ /* disable CE reporting */ /* disable interrupts (paranoia) */ * Execute copy-rename under on_trap to protect against a panic due * to an uncorrectable error. Instead, DR will abort the copy-rename * operation and rely on the OS to do the error reporting. * In general, trap handling on any cpu once the copy begins * can result in an inconsistent memory image on the target. /* copy 32 bytes at src_pa to dst_pa */ /* increment by 32 bytes */ /* decrement by 32 bytes */ * For cheetah, we need to grab the iocage lock since iocage * memory is used for e$ flush. * NOTE: This code block is dangerous at this point in the * copy-rename operation. It modifies memory after the copy * has taken place which means that any persistent state will * be abandoned after the rename operation. The code is also * performing thread synchronization at a time when all but * one processors are paused. This is a potential deadlock * This code block must be moved to drmach_copy_rename_init. * bcopy32_il is implemented as a series of ldxa/stxa via * ASI_MEM instructions. Following the copy loop, the E$ * of the master (this) processor will have lines in state * O that correspond to lines of home memory in state gI. * An E$ flush is necessary to commit these lines before * proceeding with the rename operation. * Flushing the E$ will automatically flush the W$, but * the D$ and I$ must be flushed separately and explicitly. * Each line of home memory is now in state gM, except in * the case of a cheetah processor when the E$ flush area * is included within the copied region. In such a case, * the lines of home memory for the upper half of the * flush area are in state gS. * Each line of target memory is in state gM. * Each line of this processor's E$ is in state I, except * those of a cheetah processor. All lines of a cheetah * processor's E$ are in state S and correspond to the lines * in upper half of the E$ flush area. 
* It is vital at this point that none of the lines in the * home or target memories are in state gI and that none * of the lines in this processor's E$ are in state O or Os. * A single instance of such a condition will cause loss of * coherency following the rename operation. * Rename operation complete. The physical address space * of the home and target memories have been swapped, the * routing data in the respective CASM entries have been * swapped, and LPA settings in the processor and schizo * devices have been reprogrammed accordingly. * In the case of a cheetah processor, the E$ remains * populated with lines in state S that correspond to the * lines in the former home memory. Now that the physical * addresses have been swapped, these E$ lines correspond * to lines in the new home memory which are in state gM. * This combination is invalid. An additional E$ flush is * necessary to restore coherency. The E$ flush will cause * the lines of the new home memory for the flush region * to transition from state gM to gS. The former home memory * remains unmodified. This additional E$ flush has no effect * on a cheetah+ processor. * The D$ and I$ must be flushed to ensure that coherency is * maintained. Any line in a cache that is in the valid * state has its corresponding line of the new home memory * in the gM state. This is an invalid condition. When the * flushes are complete the cache line states will be * resynchronized with those in the new home memory. /* enable CE reporting */ /* pci nodes are expected to have regs */ "Device Node 0x%x: property %s",
"Device Node 0x%x: property %s",
* Fix up unit number so that Leaf A has a lower unit number /* reassemble 64-bit base address */ * Determine PRD port indices based on slot location. * This Safari port passed POST and represents a * cpu, so check the implementation. DRMACH_PR(
"drmach_board_non_panther_cpus: exp=%d, slot=%d, " * Build the casm info portion of the CLAIM message. * if mailbox timeout or unrecoverable error from SC, * board cannot be touched. Mark the status as * Read CPU SRAM DR buffer offset from GDCD. * Read board LPA setting from GDCD. * XXX Until the Solaris large pages support heterogeneous cpu * domains, DR needs to prevent the addition of non-Panther cpus * to an all-Panther domain with large pages enabled. "UltraSPARC-IV+ board into an all UltraSPARC-IV+ domain");
/* do saf configurator stuff */ * Build the casm info portion of the UNCLAIM message. * we clear the connected flag just in case it would have * been set by a concurrent drmach_board_status() thread * before the UNCLAIM completed. * Now that the board has been successfully attached, obtain * platform-specific DIMM serial id information for the board. static char *
axq_name =
"address-extender-queue";
/* invalidate cached casm value */ /* invalidate cached axq info if for same exp */ /* search for an attached slot0 axq instance */ DRMACH_PR(
"drmach_slice_table_update: failed to " DRMACH_PR(
"using AXQ casm %d.%d for slot%d.%d\n",
* find a slice that routes to expander e. If no match * is found, drmach_slice_table[e] will remain invalid. * The CASM is a routing table indexed by slice number. * Each element in the table contains permission bits, * a destination expander number and a valid bit. The * valid bit must be true for the element to be meaningful. * Bits 0..4 expander number * NOTE: the for loop is really enumerating the range of slices, * which is ALWAYS equal to the range of expanders. Hence, * AXQ_MAX_EXP is okay to use in this loop.
if ((casm & 0x20) && (casm & 0x1f) == e)
* Get base and bound PAs for slot 1 board lpa programming * If a cpu/mem board is present in the same expander, use slice * information corresponding to the CASM. Otherwise, set base and * Reprogram slot 1 lpa's as required. * The purpose of this routine is to maintain the LPA settings of the devices * in slot 1. To date we know Schizo and Cheetah are the only devices that * require this attention. The LPA setting must match the slice field in the * CASM element for the local expander. This field is guaranteed to be * programmed in accordance with the cacheable address space on the slot 0 * board of the local expander. If no memory is present on the slot 0 board, * there is no cacheable address space and, hence, the CASM slice field will * be zero or its valid bit will be false (or both). DRMACH_PR(
"drmach...lpa_set: slot1=%d not present",
/* nothing to do when board is not found or has no devices */ DRMACH_PR(
"drmach...lpa_set: slot1=%d not present",
DRMACH_PR(
"drmach_...lpa_set: bnum=%d base=0x%lx bound=0x%lx\n",
* Skip all non-Schizo IO devices (only IO nodes * that are Schizo devices have non-zero scsr_pa). * Filter out "other" leaf to avoid writing to the DRMACH_PR(
"drmach...lpa_set: old scsr=0x%lx\n",
DRMACH_PR(
"drmach...lpa_set: new scsr=0x%lx\n",
* Check for unconfigured or powered-off * MCPUs. If CPU_READY flag is clear, the * MCPU cannot be xcalled. * for cheetah, we need to clear iocage * memory since it will be used for e$ flush * drmach_xt_mb[*] format for drmach_set_lpa * drmach_set_lpa derives processor CBASE and * CBND from bits 6 and 0:4 of drmach_xt_mb. * If bit 6 is set, then CBASE = CBND = 0. * Otherwise, CBASE = slice number; * CBND = slice number + 1. * No action is taken if bit 7 is zero. * for cheetah, we need to clear iocage * memory since it was used for e$ flush * performed in drmach_set_lpa. * Return the number of connected Panther boards in the domain. * Build the casm info portion of the UNCLAIM message. * This must be done prior to calling for saf configurator * deprobe, to ensure that the associated axq instance * If disconnecting slot 0 board, update the casm slice table * info now, for use by drmach_slot1_lpa_set() * Update LPA information for slot1 board /* disable and flush CDC */ * call saf configurator for deprobe * It's done now before sending an UNCLAIM message because * IKP will probe boards it doesn't know about <present at boot> * prior to unprobing them. If this happens after sending the * UNCLAIM, it will cause a dstop for domain transgression error. * If disconnecting a board from a Panther domain, wait a fixed-time * delay for pending Safari transactions to complete on the * disconnecting board's processors. The bus sync list read used * in drmach_shutdown_asm to synchronize with outstanding Safari * transactions assumes no read-bypass-write mode for all memory * controllers. Since Panther supports read-bypass-write, a * delay is used that is slightly larger than the maximum Safari DRMACH_PR(
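The drmach_xt_mb format for drmach_set_lpa described above (bit 7 requests action, bit 6 forces a null LPA, bits 0:4 carry the slice number) can be illustrated with a small decoder. This is a sketch only: the real decoding happens in drmach_set_lpa, and `xt_mb_decode` and the macro names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define	XT_MB_ACTION	0x80	/* bit 7: zero means take no action */
#define	XT_MB_NULL_LPA	0x40	/* bit 6: null LPA, overrides bits 0:4 */
#define	XT_MB_SLICE	0x1f	/* bits 0:4: slice number */

/*
 * Decode one mailbox byte.  Returns 0 and leaves the outputs untouched
 * when no action is requested; otherwise returns 1 with CBASE and CBND
 * filled in.
 */
static int
xt_mb_decode(uint8_t mb, unsigned int *cbase, unsigned int *cbnd)
{
	assert(cbase != NULL && cbnd != NULL);

	if ((mb & XT_MB_ACTION) == 0)
		return (0);

	if (mb & XT_MB_NULL_LPA) {
		*cbase = 0;
		*cbnd = 0;
	} else {
		*cbase = mb & XT_MB_SLICE;
		*cbnd = *cbase + 1;
	}

	return (1);
}
```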
"delayed %ld ticks (%ld secs) before disconnecting " * if mailbox timeout or unrecoverable error from SC, * board cannot be touched. Mark the status as DRMACH_PR(
"calling sc_probe_board: bnum=%d\n",
"sc_probe_board failed for bnum=%d",
* Now that the board has been successfully detached, * discard platform-specific DIMM serial id information * Get the device_type property to see if we should * continue processing this node. * If the device is a CPU without a 'portid' property, * it is a CMP core. For such cases, the parent node * This is a helper function to determine if a given * node should be considered for a dr operation according * to predefined dr type nodes and the node's name. * Formal Parameter : The name of a device node. * Return Value: -1, name does not map to a valid dr type. * A value greater than or equal to 0, name is a valid dr type. * Determine how many possible types are currently supported /* Determine if the node's name corresponds to a predefined type. */ /* The node is an allowed type for dr. */ * If the name of the node does not map to any of the * types in the array drmach_name2type then the node is not of * if the node does not have a portid property, then * by that information alone it is known that drmach * is not interested in it. /* The node must have a name */ * Ignore devices whose portid does not map to this board, * or whose name property is not mapped to a valid * Create a device data structure from this node data. * The call may yield nothing if the node is not of interest * drmach_device_new examined the node we passed in * and determined that it was either one not of * interest to drmach or the PIM dr layer. DRMACH_PR(
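The name lookup helper described above (return -1 when a node name is not a dr type, otherwise a value greater than or equal to 0) can be sketched as a linear search. The table below is purely illustrative and is not the contents of drmach_name2type; `dr_type_names` and `name_to_type_index` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative table only; the real entries live in drmach_name2type. */
static const char *const dr_type_names[] = {
	"cpu",
	"memory-controller",
	"pci"
};

/*
 * Return the index of name in the table, or -1 if the name does not
 * map to a supported type.
 */
static int
name_to_type_index(const char *name)
{
	size_t i;
	size_t ntypes = sizeof (dr_type_names) / sizeof (dr_type_names[0]);

	assert(name != NULL);

	for (i = 0; i < ntypes; i++) {
		if (strcmp(name, dr_type_names[i]) == 0)
			return ((int)i);
	}

	return (-1);
}
```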
"Unknown test status=0x%x from SC\n",
* If the board is an I/O or MAXCAT board, setup I/O cage for * testing. Slot 1 indicates I/O or MAXCAT board. /* examine test status */ * If I/O cage test was performed, check for availability of the * cpu used. If cpu has been returned, it's OK to proceed with * reconfiguring it for use. * Check the cpu_recovered flag in the testboard reply, or * if the testboard request message was not sent to SMS due * to an mboxsc_putmsg() failure, it's OK to recover the * cpu since hpost hasn't touched it. "after I/O cage test: cpu_recovered=%d, " * If the node does not have a portid property, * it represents a CMP device. For a CMP, the reg * property of the parent holds the information of /* reassemble 64-bit base address */ * A return value of 1 indicates success and 0 indicates a failure * Confirm cpu was in ready set when xc was issued. * This is done by verifying rv which is * set to 0x1 when xc_one is successful. * If a CPU does not have a portid property, it must * be a CMP device with a cpuid property. /* Starcat CMP core id is bit 2 of the cpuid */ * Init the board cpu type. Assumes all board cpus are the same type. * determine if the domain uses Cheetah procs * Initialize TTE for mapping CPU SRAM STARDRB buffer. * The STARDRB buffer (16KB on Cheetah+ boards, 32KB on * pair. Each cpu uses 8KB according to the following layout: DRMACH_PR(
"drmach_cpu_new: cpuid=%d, coreid=%d, stardrb_offset=0x%lx, " * NOTE: restart_other_cpu pauses cpus during the * slave cpu start. This helps to quiesce the * bus traffic a bit which makes the tick sync * routine in the prom more robust. DRMACH_PR(
"drmach_cpu_start: cannot read board info for " * drmach_xt_mb[*] format for drmach_set_lpa * bit 6 set null LPA (overrides bits 0:4) * drmach_set_lpa derives processor CBASE and CBND * from bits 6 and 0:4 of drmach_xt_mb. If bit 6 is * set, then CBASE = CBND = 0. Otherwise, CBASE = slice * number; CBND = slice number + 1. * No action is taken if bit 7 is zero. "waited %d out of %d tries for drmach_set_lpa on cpu%d",
* A detaching CPU is xcalled with an xtrap to drmach_cpu_stop_self() after * it has been offlined. The function of this routine is to get the cpu * spinning in a safe place. The requirement is that the system will not * reference anything on the detaching board (memory and i/o is detached * elsewhere) and that the CPU not reference anything on any other board * in the system. This isolation is required during and after the writes * to the domain masks to remove the board from the domain. * To accomplish this isolation the following is done: * 1) Create a locked mapping to the STARDRB data buffer located * in this cpu's sram. There is one TTE per cpu, initialized in * drmach_cpu_new(). The cpuid is used to select which TTE to use. * Each Safari port pair shares the CPU SRAM on a Serengeti CPU/MEM * board. The STARDRB buffer is 16KB on Cheetah+ boards, 32KB on Jaguar * boards. Each STARDRB buffer is logically divided by DR into one * 8KB page per cpu (or Jaguar core). * 2) Copy the target function (drmach_shutdown_asm) into buffer. * 3) Jump to function now in the cpu sram. * 3.1) Flush its Ecache (displacement). * 3.2) Flush its Dcache with HW mechanism. * 3.3) Flush its Icache with HW mechanism. * 3.4) Flush all valid and _unlocked_ D-TLB and I-TLB entries. * 3.6) Clear xt_mb to signal completion. Note: cache line is * recovered by drmach_cpu_poweroff(). * 4) Jump into an infinite loop. /* copy text. standard bcopy not designed to work in nc space */ /* zero to assist debug */ /* a parking spot for the stack pointer */ /* call copy of drmach_shutdown_asm */ * Flush this cpu's ecache, then ensure all outstanding safari * transactions have retired. * safari IDs end in 0x1C. * All PCI B-Leafs are at configspace 0x70.0000. * Verify if the dip is an instance of MAN 'eri'. * Verify if the parent is schizo(xmits)0 and pci B leaf. * This RIO could be on XMITS, so get the dip to * Finally make sure it is the MAN eri. 
* The network function of the RIO ASIC will always be * device 3 and function 1 ("network@3,1"). /* ignore devices that are not on this board */ /* walk device tree to find iosram instance for the board */ DRMACH_PR(
"drmach_io_pre_release: bnum=%d iosram=%d eri=0x%p\n",
* Release hold acquired in drmach_board_find_io_insts() /* call for tunnel switch */ * Walk device tree to find rio dip for the board * Since we are not interested in iosram instance here, * initialize it to 0, so that the walk terminates as * soon as eri dip is found. * Root node doesn't have to be held in any way. DRMACH_PR(
"drmach_io_unrelease: bnum=%d eri=0x%p\n",
* Release hold acquired in * drmach_board_find_io_insts() * Walk device tree to find rio dip for the board * Since we are not interested in iosram instance here, * initialize it to 0, so that the walk terminates as * soon as eri dip is found. * Root node doesn't have to be held in any way. DRMACH_PR(
"drmach_io_release: bnum=%d eri=0x%p\n",
* Release hold acquired in * drmach_board_find_io_insts() * Always called after drmach_unconfigure() which on Starcat * unconfigures the branch but doesn't remove it so the * We held the branch rooted at dip earlier, so at a minimum the * root i.e. dip must be present in the device tree. * Walk device tree to find rio dip for the board * Since we are not interested in iosram instance here, * initialize it to 0, so that the walk terminates as * soon as eri dip is found. * Root node doesn't have to be held in any way. DRMACH_PR(
"drmach_io_post_attach: bnum=%d eri=0x%p\n",
* Release hold acquired in drmach_board_find_io_insts() * Hardware registers are organized into consecutively * addressed registers. The reg property's hi and lo fields * together describe the base address of the register set for * this memory-controller. Register descriptions and offsets * (from the base address) are as follows: * Description Offset Size (bytes) * Memory Timing Control Register I 0x00 8 * Memory Timing Control Register II 0x08 8 * Memory Address Decoding Register I 0x10 8 * Memory Address Decoding Register II 0x18 8 * Memory Address Decoding Register III 0x20 8 * Memory Address Decoding Register IV 0x28 8 * Memory Address Control Register 0x30 8 * Memory Timing Control Register III 0x38 8 * Memory Timing Control Register IV 0x40 8 * Memory Timing Control Register V 0x48 8 (Jaguar, Panther only) * EMU Activity Status Register 0x50 8 (Panther only) * Only the Memory Address Decoding Register and EMU Activity Status * Register addresses are needed for DRMACH. * If none of the banks had their valid bit set, that means * post did not configure this MC to participate in the * domain. So, pretend this node does not exist by returning /* drmach_mem_dispose frees board mem list */ * Only one mem unit per board is exposed to the * PIM layer. The first mem unit encountered during * tree walk is used to represent all mem units on /* start list of mem units on this board */ * force unum to zero since this is the only mem unit * that will be visible to the PIM layer. * board memory size kept in this mem unit only /* drmach_mem_dispose frees board mem list */ * allow this instance (the first encountered on this board) * to be visible to the PIM layer. /* hide this mem instance behind the first. */ * hide this instance from the caller. * See drmach_board_find_devices_cb() for details. " to kernel cage",
size >> 20);
/* catch this in debug kernels */ "unexpected kcage_range_delete_post_mem_del"
#define MB(mb) ((mb) * 1048576ull)
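/*
 * Illustrative sketch, not part of drmach.c: how the MB() macro above and
 * the "size >> 20" shift used earlier relate to each other, plus a
 * power-of-two round-down of the kind a slice-boundary calculation needs.
 * bytes_to_mb() and round_down_to_slice() are hypothetical helper names
 * introduced only for this example.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Same definition as MB() above: megabytes to bytes in 64-bit arithmetic. */
#define MB(mb) ((mb) * 1048576ull)

/* Hypothetical helper: bytes to whole megabytes (the "size >> 20" idiom). */
static uint64_t
bytes_to_mb(uint64_t bytes)
{
	return (bytes >> 20);
}

/*
 * Hypothetical helper: round a physical address down to the start of the
 * slice that contains it. Assumes slice_size is a power of two.
 */
static uint64_t
round_down_to_slice(uint64_t pa, uint64_t slice_size)
{
	assert(slice_size != 0 && (slice_size & (slice_size - 1)) == 0);
	return (pa & ~(slice_size - 1));
}
```

/*
 * Usage, with an arbitrary 64 MB slice chosen only for the example (it is
 * not claimed to be the real slice size): round_down_to_slice(MB(100),
 * MB(64)) yields MB(64), and bytes_to_mb(MB(64)) yields 64.
 */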
/* prime the result with a default value */ /* get register value, extract uk and normalize */ for (i = 0; i < len; i++)
* remember largest segment size, * uk not in table, punt using * entire slice size. no longer any * reason to check other banks. /* should not happen, but ... */ * The list is zero terminated. * Offset the pa by a doubleword * to avoid confusing a pa value * of zero with the terminator. /* try to choose a proc on the target board */ /* otherwise, this proc, wherever it is */ /* get starting physical address of target memory */ /* round down to slice boundary */ /* stop at first span that is in slice */ uprintf(
"showlpa %s::%s portid %d, base pa %lx, bound pa %lx\n",
/* do saf configurator stuff */ /* copy 32 bytes at src_pa to dst_pa */ /* increment by 32 bytes */ /* decrement by 32 bytes */ * Starcat DR passthrus are for debugging purposes only. /* the following line must always be last */ * Since CPU nodes are not configured, it is * necessary to skip the unconfigure step as * drmach_unconfigure() is always called on a configured branch. * So the root of the branch was held earlier and must exist. DRMACH_PR(
"drmach_unconfigure: unconfiguring DDI branch");
/* The node must have a name */ * NOTE: FORCE flag is no longer needed under devfs * If non-NULL, fdip is returned held and must be * If we were unconfiguring an IO board, a call was * made to man_dr_detach. We now need to call * man_dr_attach to regain man use of the eri. * Walk device tree to find rio dip for * Since we are not interested in iosram * instance here, initialize it to 0, so * that the walk terminates as soon as * Root node doesn't have to be held in * Release hold acquired in * drmach_board_find_io_insts() * drmach interfaces to legacy Starfire platmod logic * linkage via runtime symbol look up, called from plat_cpu_power* * Start up a cpu. It is possible that we're attempting to restart * the cpu after an UNCONFIGURE in which case the cpu will be * spinning in its cache. So, all we have to do is wake it up. * Under normal circumstances the cpu will be coming from a previous * CONNECT and thus will be spinning in OBP. In both cases, the * startup sequence is the same. * For Cheetah, we need to grab the iocage lock since iocage * memory is used for e$ flush. * Set affinity to ensure consistent reading and writing of * drmach_xt_mb[cpuid] by one "master" CPU directing * the shutdown of the target CPU. * Capture all CPUs (except for detaching proc) to prevent * crosscalls to the detaching proc until it has cleared its * The CPUs remain paused and the prom_mutex is known to be free. * This prevents blocking when doing prom IEEE-1275 calls at a * Quiesce interrupts on the target CPU. We do this by setting * the CPU 'not ready' (i.e. removing the CPU from cpu_ready_set) to * prevent it from receiving cross calls and cross traps. * This prevents the processor from receiving any new soft interrupts. /* setup xt_mb, will be cleared by drmach_shutdown_asm when ready */ "drmach_cpu_shutdown_self on cpu%d",
* Do this here instead of drmach_cpu_shutdown_self() to "iocage scrub failed, drmach_bc_bzero returned %d\n",
rv);
"iocage scrub failed, drmach_bc_bzero rv=%d\n",
* HPOST wants the address of the cage to be 64 megabyte-aligned * The size of the cage is also in megabyte units.
static char *fn = "drmach_iocage_cpu_acquire";
* There is a known HW bug where a Jaguar CPU in Safari port 0 (SBX/P0) * can fail to receive an XIR. To work around this issue until a hardware * fix is implemented, we will exclude the selection of these CPUs. * Once a fix is implemented in hardware, this code should be updated * to allow Jaguar CPUs that have the fix to be used. However, support * must be retained to skip revisions that do not have this fix. DRMACH_PR(
"%s: excluding CPU id %d: port 0 on jaguar",
"during I/O cage test selection",
cpuid);
"no-intr during I/O cage test selection",
cpuid);
DRMACH_PR(
"%s: cpu_unconfigure failed for CPU id %d",
fn,
"during I/O cage test selection",
cpuid);
"during I/O cage test selection",
"poweron" :
"online",
cpuid);
"no-intr during I/O cage test selection",
cpuid);
* Attempt to acquire all the CPU devices passed in. It is * assumed that all the devices in the list are the cores of * a single CMP device. Non-CMP devices can be handled as a * single-core CMP by passing in a one-element list. * Success is only returned if *all* the devices in the list * can be acquired. In the failure case, none of the devices * in the list will be held as acquired. * Walk the list of CPU devices (cores of a CMP) * and attempt to acquire them. Bail out if an /* check for the end of the list */ * Make a best effort attempt to return any cores * that were already acquired before the error was for (i = 0; i < curr; i++) {
static char *fn = "drmach_iocage_cpu_return";
"after I/O cage test",
cpuid);
* The component was never set to unconfigured during the IO * cage test, so we need to leave it marked as busy to prevent * further DR operations involving this component. "poweron" : "online", cpuid);
* drmach_iocage_cpu_acquire will accept cpus in state P_ONLINE or * P_NOINTR. Need to return to previous user-visible state. "no-intr after I/O cage test",
cpuid);
/* An AXQ restriction disqualifies MCPUs as candidates. */ * Walk the device list of this board. /* only interested in CPU devices */ * The following code assumes two properties * 1. All cores of a CMP are grouped together * 2. There will only be a maximum of two cores * If either of these two properties changes, * this code will have to be revisited. * Get the next device. It may or may not be used. * The second device is only interesting for * this pass if it has the same portid as the * first device. This implies that both are * Attempt to acquire all cores of the CMP. * Check if the search for the second core was * successful. If not, the next iteration should * Set up an iocage by acquiring a cpu and memory. * Table of saved state for paused slot1 devices. DRMACH_PR(
"drmach_is_slot1_pause_axq: no reg prop for " * Allocate an entry in the slot1_paused state table. * XXX This dip should really be held (via ndi_hold_devi()) * before saving it in the axq pause structure. However that * would prevent DR as the pause data structures persist until * the next suspend. drmach code should be modified to free * the slot 1 pause data structures for a boardset when its * slot 1 board is DRed out. The dip can then be released via * ndi_rele_devi() when the pause data structure is freed * allowing DR to proceed. Until this change is made, drmach * code should be careful about dereferencing the saved dip * as it may no longer exist. * Tree walk callback routine. If dip represents a Schizo PCI leaf, * fill in the appropriate info in the slot1_paused state table. DRMACH_PR(
"drmach_find_slot1_io: no reg prop for pci " * XXX This dip should really be held (via ndi_hold_devi()) * before saving it in the pci pause structure. However that * would prevent DR as the pause data structures persist until * the next suspend. drmach code should be modified to free * the slot 1 pause data structures for a boardset when its * slot 1 board is DRed out. The dip can then be released via * ndi_rele_devi() when the pause data structure is freed * allowing DR to proceed. Until this change is made, drmach * code should be careful about dereferencing the saved dip as * it may no longer exist. DRMACH_PR(
"drmach_find_slot1_io: name=%s, portid=0x%x, dip=%p\n",
* Root node doesn't have to be held * Save the interrupt mapping registers for each non-idle interrupt * represented by the bit pairs in the saved interrupt state * diagnostic registers for this PCI leaf. * 1st pass allocates, 2nd pass populates. for (i = 0; i <
2; i++) {
* Xmits Interrupt Number Offset(ino) Assignments * 00-17 PCI Slot Interrupts * Xmits Interrupt Number Offset(ino) Assignments * 30-37 Internal interrupts * OBIO and internal schizo interrupts * Each PCI leaf has a set of mapping registers for all * possible interrupt sources except the NewLink interrupts. * Select l2_io_queue counter by writing L2_IO_Q mux * input to bits 0-6 of perf cntr select reg. DRMACH_PR(
"drmach_s1p_axq_update: axq #%d pic_l2_io_q[%d]=%d\n",
* Called post-suspend and pre-resume to snapshot the suspend state * of slot1 AXQs and Schizos. * Starcat hPCI Schizo devices. * The name field is overloaded. NULL means the slot (interrupt concentrator * bus) is not used. intr_mask is a bit mask representing the 4 possible * interrupts per slot, on if valid (rio does not use interrupt lines 0, 1).
/* Schizo 0 */ /* Schizo 1 */
{{"C3V0", 0xf}, {"C3V1", 0xf}}, /* slot 0 */
{{"C5V0", 0xf}, {"C5V1", 0xf}}, /* slot 1 */
{{"rio", 0xc}, {NULL, 0x0}}, /* slot 2 */
{{NULL, 0x0}, {NULL, 0x0}}, /* slot 3 */
{{"sbbc", 0xf}, {NULL, 0x0}}, /* slot 4 */
{{NULL, 0x0}, {NULL, 0x0}}, /* slot 5 */
{{NULL, 0x0}, {NULL, 0x0}}, /* slot 6 */
* See Schizo Specification, Revision 51 (May 23, 2001), Section 22.4.4 * "Interrupt Registers", Table 22-69, page 306.
case (0x0):
	return ("Uncorrectable ECC error");
case (0x1):
	return ("Correctable ECC error");
case (0x2):
	return ("PCI Bus A Error");
case (0x3):
	return ("PCI Bus B Error");
case (0x4):
	return ("Safari Bus Error");
default:
	return ("Reserved");
prom_printf(
"IO%d/P%d PCI slot interrupt: ino=0x%x, source device=%s, " * Log interrupt source device info for all valid, pending interrupts * on each Schizo PCI leaf. Called if Schizo has logged a Safari bus * error in the error ctrl reg. * Check the saved interrupt mapping registers. If interrupt is valid, * map the ino to the Schizo source device and check that the pci * slot and interrupt line are valid. }
else if (ino <= 0x2f) {
} else if (ino <= 0x37) {
"interrupt: ino=0x%x (%s)\n",
"interrupt: ino=0x%x\n",
exp,
"exp=%d, schizo=%d, pci_leaf=%c, " "ino=0x%x, intr_map_reg=0x%lx\n",
* See Schizo Specification, Revision 51 (May 23, 2001), Section 22.2.4 * "Safari Error Control/Log Registers", Table 22-11, page 248. * Check for possible error indicators prior to resuming the * AXQ driver, which will de-assert slot1 AXQ_DOMCTRL_PAUSE. * Check for logged schizo bus error and pending interrupts. "attempt detected during " "post suspend" :
"pre resume");
* Check for changes in axq l2_io_q performance counters (2nd pass only) "detected on IO%d during copy-rename: " "AXQ l2_io_q performance counter "
{ "address-extender-queue", NULL },
{ NULL, NULL }, /* terminator -- required */
/* place new node behind head node on ring list */ /* start search with most likely node */ " disposed sr node for dip %p", dip);
DRMACH_PR("drmach_sr_delete: still searching\n");
/* every dip should be found during resume */
DRMACH_PR("ERROR: drmach_sr_delete: can't find dip %p", dip);
/* schedule init for next suspend */ suspend ?
"suspending" :
"resuming",
* The ordering array declares the strict sequence in which * the named drivers are to be suspended. Each element in * the array may have a doubly linked ring list of driver * instances (dip) in the order in which they were presented * to drmach_verify_sr. If present, walk the list in the * forward direction to suspend each instance. op -= 1;
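/*
 * Illustrative sketch, not the drmach.c code: walking a NULL-terminated
 * ordering array forward to suspend and backward to resume, so that
 * drivers resume in the reverse of the order in which they were
 * suspended. struct sr_order, walk_order() and the driver names in
 * demo_order are hypothetical, and the per-element ring list of driver
 * instances is left out to keep the example short.
 */

```c
#include <assert.h>
#include <stddef.h>

struct sr_order {
	const char *name;	/* NULL marks the terminating element */
};

/* Hypothetical ordering table; the names are placeholders. */
static const struct sr_order demo_order[] = {
	{ "first-driver" },
	{ "second-driver" },
	{ NULL }		/* terminator */
};

/*
 * Record the visit order in out[] (at most cap entries) and return the
 * number of names recorded. suspend != 0 walks forward from the first
 * element; suspend == 0 finds the terminator and walks backward.
 */
static size_t
walk_order(const struct sr_order *tbl, int suspend, const char **out,
    size_t cap)
{
	const struct sr_order *op = tbl;
	size_t n = 0;

	assert(tbl != NULL && out != NULL);

	if (suspend) {
		for (; op->name != NULL && n < cap; op++)
			out[n++] = op->name;
	} else {
		while (op->name != NULL)
			op++;		/* point at terminating element */
		while (op != tbl && n < cap) {
			op -= 1;	/* step back to the previous entry */
			out[n++] = op->name;
		}
	}
	return (n);
}
```

/*
 * With demo_order, a suspend walk visits first-driver then second-driver,
 * and a resume walk visits second-driver then first-driver.
 */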
/* point at terminating element */ * walk ordering array and rings backwards to resume dips * in reverse order in which they were suspended * Return value: 0 success, non-zero failure. DRMACH_PR(
"drmach_log_sysevent: %s %s, flag: %d, verbose: %d\n",
* Log the event but do not sleep waiting for its * delivery. This provides insulation from syseventd. "drmach_log_sysevent failed (rv %d) for %s %s\n",
* Only the valid entries are modified, so the array should be zeroed out * Only the valid entries are modified, so the array should be zeroed out ASSERT(rv == 0);
/* should never be out of bounds */ * Do not allow physical address range modification if either board on this * expander has processors in NULL LPA mode (CBASE=CBND=NULL). * A side effect of NULL proc LPA mode in Starcat SSM is that local reads will * install the cache line as owned/dirty as a result of the RTSR transaction. * See section 5.2.3 of the Safari spec. All processors will read the bus sync * list before the rename after flushing local caches. When copy-rename * requires changing the physical address ranges (i.e. smaller memory target), * the bus sync list contains physical addresses that will not exist after the * rename. If these cache lines are owned due to an RTSR, a system error can * occur following the rename when these cache lines are evicted and a writeback * Incoming parameter represents either the copy-rename source or a candidate * target memory board. On Starcat, only slot0 boards may have memory. * This is reason enough to fail the request, no need * to check the device list for cpus. * Check for MCPU board on the same expander. * The board flag DRMACH_NULL_PROC_LPA can be set for all board * types, as it is derived from the POST gdcd board flag * L1SSFLG_THIS_L1_NULL_PROC_LPA, which can be set (and should be * ignored) for boards with no processors. Since NULL proc LPA * applies only to processors, we walk the devices array to detect * Fail MCPU in NULL LPA mode.