mach_cpu_states.c revision d07db889a707792afa6ee57f6361c13f1f3f471f
/*
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 *
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each file.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 */

/*
 * Copyright (c) 2000, 2010, Oracle and/or its affiliates. All rights reserved.
 */

/*
 * hvdump_buf_va is a pointer to the currently-configured hvdump_buf.
 * A value of NULL indicates that this area is not configured.
 * hvdump_buf_sz is tunable but will be clamped to HVDUMP_SIZE_MAX.
 */

/*
 * For xt_sync synchronization.
 */

/*
 * We keep our own copies, used for cache flushing, because we can be called
 * before cpu_fiximpl().
 */

/*
 * In an LDoms system we do not save the user's boot args in NVRAM
 * as is done on legacy systems.  Instead, we format and send a
 * 'reboot-command' variable to the variable service.  The contents
 * of the variable are retrieved by OBP and used verbatim for
 * the next boot.
 *
 * invoke_cb is set to true when we are in a normal shutdown sequence
 * (interrupts are not blocked, the system is not panicking or being
 * suspended).  In that case, we can use any method to store the boot
 * command.  Otherwise storing the boot command cannot be done using
 * a domain service because it cannot be safely used in that context.
 */

	/*
	 * Save the reboot-command with HV, if reboot data group is
	 * negotiated.  Else save the reboot-command via vars-config
	 * domain service.
	 */
		    "use on reboot with HV: error = 0x%lx", status);
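/*
 * The sketch below is not the original code; it only illustrates the flow
 * described in the comments above.  format_boot_cmd() and BOOT_CMD_MAX_LEN
 * are assumed helpers, hv_reboot_data_set() stands for the hypervisor
 * reboot-data interface negotiated at boot, and the warning wording is
 * illustrative except for the fragment preserved from this file.
 */
#define	BOOT_CMD_MAX_LEN	256		/* assumed bound */

static void
store_boot_cmd_sketch(char *args)
{
	static char	cmd[BOOT_CMD_MAX_LEN];
	uint64_t	status;
	size_t		len;

	/* Format the command exactly as OBP expects to consume it. */
	len = format_boot_cmd(cmd, sizeof (cmd), args);

	/* Preferred path: hand the formatted command to the HV. */
	status = hv_reboot_data_set(va_to_pa(cmd), len);
	if (status != H_EOK) {
		cmn_err(CE_WARN, "boot command could not be stored for "
		    "use on reboot with HV: error = 0x%lx", status);
	}
}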
/*
 * Machine dependent code to reboot.
 *
 * "bootstr", when non-null, points to a string to be used as the
 * argument string when rebooting.
 *
 * "invoke_cb" is a boolean.  It is set to true when mdboot() can safely
 * invoke CB_CL_MDBOOT callbacks before shutting the system down, i.e. when
 * we are in a normal shutdown sequence (interrupts are not blocked, the
 * system is not panicking or being suspended).
 */

	/*
	 * XXX - rconsvp is set to NULL to ensure that output messages
	 * are sent to the underlying "hardware" device using the
	 * monitor's printf routine since we are in the process of
	 * either rebooting or halting the machine.
	 */

		/*
		 * LDoms: By storing a no-op command
		 * in the 'reboot-command' variable we cause OBP
		 * to ignore the setting of 'auto-boot?' after
		 * it completes the reset.  This causes the system
		 * to stop at the ok prompt.
		 */

		    "mdboot: invalid function %d", fcn);
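/*
 * A minimal sketch (not the file's mdboot()) of the dispatch the comments
 * above describe, reusing the illustrative store_boot_cmd_sketch() helper.
 * AD_HALT and AD_BOOT are the standard uadmin(2) function codes; the real
 * routine handles more cases than shown here.
 */
static void
mdboot_cmd_sketch(int fcn, char *bootstr)
{
	switch (fcn) {
	case AD_HALT:
		/*
		 * Store a no-op 'reboot-command' so OBP ignores
		 * 'auto-boot?' and stops at the ok prompt.
		 */
		store_boot_cmd_sketch("noop");
		break;
	case AD_BOOT:
		/* Preserve the caller's boot arguments for the next boot. */
		store_boot_cmd_sketch(bootstr != NULL ? bootstr : "");
		break;
	default:
		cmn_err(CE_WARN, "mdboot: invalid function %d", fcn);
		break;
	}
}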
	/*
	 * If LDoms is running, we must save the boot string
	 * before we enter restricted mode.  This is possible
	 * only if we are not being called from panic.
	 */

	/*
	 * At a high interrupt level we can't:
	 *	1) bring up the console, or
	 *	2) wait for pending interrupts prior to redistribution
	 *	   to the current CPU
	 * so we do them now.
	 */

	/* make sure there are no more changes to the device tree */

	/*
	 * Clear any unresolved UEs from memory.
	 */

	/*
	 * stop other cpus which also raise our priority.  since there is only
	 * one active cpu after this, and our priority will be too high
	 * for us to be preempted, we're essentially single threaded
	 * from here on out.
	 */

	/*
	 * try and reset leaf devices.  reset_leaves() should only
	 * be called when there are no other threads that could be
	 * accessing devices.
	 */

/* mdpreboot - may be called prior to mdboot while root fs still mounted */

/*
 * Halt the machine and then reboot with the device
 * and arguments specified in bootstr.
 */

	/*
	 * For platforms that use CPU signatures, we
	 * need to set the signature block to OS and
	 * the state to exiting for all the processors.
	 */

/*
 * We use the x-trap mechanism and idle_stop_xcall() to stop the other CPUs.
 * Once in panic_idle() they raise spl, record their location, and spin.
 */

/*
 * Force the other CPUs to trap into panic_idle(), and then remove them
 * from the cpu_ready_set so they will no longer receive cross-calls.
 */

	for (i = 0; i < NCPU; i++) {
		/* ... */
			printf("panic: failed to stop cpu%d\n", i);
/*
 * Platform callback following each entry to panicsys().  If we've panicked at
 * level 14, we examine t_panic_trap to see if a fatal trap occurred.  If so,
 * we disable further %tick_cmpr interrupts.  If not, an explicit call to panic
 * was made and so we re-enqueue an interrupt request structure to allow
 * further level 14 interrupts to be processed once we lower PIL.  This allows
 * us to handle panics from the deadman() CY_HIGH_LEVEL cyclic.
 */

	/* there are no possible error codes for this hcall */

		/*
		 * Clear SOFTINT<14>, SOFTINT<0> (TICK_INT)
		 * and SOFTINT<16> (STICK_INT) to indicate
		 * that the current level 14 has been serviced.
		 */

/*
 * Miscellaneous hardware-specific code to execute after panicstr is set
 * by the panic code: we also print and record PTL1 panic information here.
 */

	/*
	 * Turn off TRAPTRACE and save the current %tick value in panic_tick.
	 */

	/* there are no possible error codes for this hcall */

	/*
	 * For Platforms that use CPU signatures, we
	 * need to set the signature block to OS, the state to
	 * exiting, and the substate to panic for all the processors.
	 */

	/*
	 * Disable further ECC errors from the bus nexus.
	 */

	/*
	 * Redirect all interrupts to the current CPU.
	 */

	/*
	 * This call exists solely to support dumps to network
	 * devices after sync from OBP.
	 *
	 * If we came here via the sync callback, then on some
	 * platforms, interrupts may have arrived while we were
	 * stopped in OBP.  OBP will arrange for those interrupts to
	 * be redelivered if you say "go", but not if you invoke a
	 * client callback like 'sync'.  For some dump devices
	 * (network swap devices), we need interrupts to be
	 * delivered in order to dump, so we have to call the bus
	 * nexus driver to reset the interrupt state machines.
	 */

/*
 * Platforms that use CPU signatures need to set the signature block to OS and
 * the state to exiting for all CPUs.  PANIC_CONT indicates that we're about to
 * write the crash dump, which tells the SSP/SMS to begin a timeout routine to
 * reboot the machine if the dump never completes.
 */

	panic("ptl1_init_cpu: not enough space left for ptl1_panic "
	    "stack, sizeof (struct cpu) = %lu",
	    (unsigned long)sizeof (struct cpu));
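/*
 * Hedged sketch of the check behind the panic above: each CPU's ptl1_panic
 * stack is carved out of the space that follows struct cpu inside the CPU's
 * backing allocation, so setup verifies that enough room remains.  PTL1_SSIZE
 * and CPU_ALLOC_SIZE stand in for the platform's stack-size and per-CPU
 * allocation-size constants; treat the exact names as assumptions.
 */
static void
ptl1_init_cpu_sketch(struct cpu *cp)
{
	if (sizeof (struct cpu) + PTL1_SSIZE > CPU_ALLOC_SIZE) {
		panic("ptl1_init_cpu: not enough space left for ptl1_panic "
		    "stack, sizeof (struct cpu) = %lu",
		    (unsigned long)sizeof (struct cpu));
	}

	/*
	 * Otherwise the trap-level-1 panic stack for this CPU tops out at
	 * (uintptr_t)cp + CPU_ALLOC_SIZE, just past struct cpu itself.
	 */
}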
"trap for debug purpose",
/* PTL1_BAD_DEBUG */ "unknown trap",
/* PTL1_BAD_DEBUG */ "register window trap",
/* PTL1_BAD_WTRAP */ "kernel MMU miss",
/* PTL1_BAD_KMISS */ "kernel protection fault",
/* PTL1_BAD_KPROT_FAULT */ "ISM MMU miss",
/* PTL1_BAD_ISM */ "kernel MMU trap",
/* PTL1_BAD_MMUTRAP */ "kernel trap handler state",
/* PTL1_BAD_TRAP */ "floating point trap",
/* PTL1_BAD_FPTRAP */ "pointer to intr_vec",
/* PTL1_BAD_INTR_VEC */ "unknown trap",
/* PTL1_BAD_INTR_VEC */ "TRACE_PTR state",
/* PTL1_BAD_TRACE_PTR */ "unknown trap",
/* PTL1_BAD_TRACE_PTR */ "stack overflow",
/* PTL1_BAD_STACK */ "DTrace flags",
/* PTL1_BAD_DTRACE_FLAGS */ "attempt to steal locked ctx",
/* PTL1_BAD_CTX_STEAL */ "CPU ECC error loop",
/* PTL1_BAD_ECC */ "unexpected error from hypervisor call",
/* PTL1_BAD_HCALL */ "unexpected global level(%gl)",
/* PTL1_BAD_GL */ "Watchdog Reset",
/* PTL1_BAD_WATCHDOG */ "unexpected RED mode trap",
/* PTL1_BAD_RED */ "return value EINVAL from hcall: "\
"UNMAP_PERM_ADDR",
/* PTL1_BAD_HCALL_UNMAP_PERM_EINVAL */ "return value ENOMAP from hcall: "\
"UNMAP_PERM_ADDR",
/* PTL1_BAD_HCALL_UNMAP_PERM_ENOMAP */ "error raising a TSB exception",
/* PTL1_BAD_RAISE_TSBEXCP */ "missing shared TSB" /* PTL1_NO_SCDTSB8K */ * Use trap_info for a place holder to call panic_savetrap() and * panic_showtrap() to save and print out ptl1_panic information. * Restore the watchdog timer when returning from a debugger * after a panic or L1-A and resume watchdog pat. "dump buffer. Error = 0x%lx, size = 0x%lx," "Available buffer size = 0x%lx," "Minimum buffer size required = 0x%lx",
"buffer. Error = 0x%lx",
ret);
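/*
 * Hedged sketch of the hvdump buffer setup that the messages above belong to:
 * clamp the tunable size, allocate a physically contiguous buffer, and offer
 * it to the hypervisor; on failure, release the buffer and report the error
 * plus the minimum size the hypervisor asked for.  hv_dump_buf_update() is
 * assumed to be the hypervisor dump-buffer interface, the allocator choice
 * and the message wording are illustrative.
 */
static void
hvdump_buf_init_sketch(void)
{
	uint64_t ret, minsize = 0;

	if (hvdump_buf_sz > HVDUMP_SIZE_MAX)
		hvdump_buf_sz = HVDUMP_SIZE_MAX;

	hvdump_buf_va = (caddr_t)contig_mem_alloc(hvdump_buf_sz);
	if (hvdump_buf_va == NULL)
		return;

	hvdump_buf_pa = va_to_pa(hvdump_buf_va);
	ret = hv_dump_buf_update(hvdump_buf_pa, hvdump_buf_sz, &minsize);

	if (ret != H_EOK) {
		cmn_err(CE_NOTE, "!Failed to set up the hypervisor state "
		    "dump buffer. Error = 0x%lx, size = 0x%lx, "
		    "Minimum buffer size required = 0x%lx",
		    ret, hvdump_buf_sz, minsize);
		contig_mem_free(hvdump_buf_va, hvdump_buf_sz);
		hvdump_buf_va = NULL;
	}
}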
		value = 1;	/* boolean properties */

	panic("stick_frequency property not found in MD");

	panic("cannot allocate list for MD properties");

	panic("stick_frequency property not found in MD");
"cpuid: 0x%x has been marked in " "unexpected hypervisor error 0x%x " "while sending a mondo to cpuid: " * If there is a big jump between the current tick * count and lasttick, we have probably hit a break * point. Adjust endtick accordingly to avoid panic. "(target 0x%x) [retries: 0x%x hvstat: 0x%x]",
* Assemble CPU list for HV argument. We already know * smallestid and largestid are members of set. * Either not all CPU mondos were sent, or an * error occurred. CPUs that were sent mondos * have their CPU IDs overwritten in cpu_list. * Reset cpu_list so that it only holds those * CPU IDs that still need to be sent. for (i = 0, j = 0; i <
ncpuids; i++) {
* Now handle possible errors returned * Remove any CPUs in the error state from * cpu_list. At this point cpu_list only * contains the CPU IDs for mondos not "H_ECPUERROR but no CPU in " "cpu_list in error state");
"CPU(s) in error state");
* For all other errors, panic. "hypervisor error 0x%x while sending a " "mondo to cpuid(s):",
stat);
* If there is a big jump between the current tick * count and lasttick, we have probably hit a break * point. Adjust endtick accordingly to avoid panic. "[retries: 0x%x] cpuids: ",
retries);
for (
rc = 0, i = 0; i <
NCPU; i++) {
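/*
 * Hedged sketch of the retry loop the comments above describe.  It assumes a
 * two-argument hv_cpu_mondo_send(count, cpu_list_pa) wrapper (check
 * hypervisor_api.h for the exact interface) and illustrative timeout
 * bookkeeping; the real routine also compacts cpu_list after partial sends
 * and strips CPUs reported in the error state before retrying.
 */
static void
send_mondo_set_sketch(uint64_t cpu_list_pa, int ncpuids)
{
	uint64_t stat, starttick, endtick, lasttick, tick;
	int retries = 0;

	starttick = lasttick = gettick();
	endtick = starttick + xc_tick_limit;

	for (;;) {
		stat = hv_cpu_mondo_send(ncpuids, cpu_list_pa);
		if (stat == H_EOK)
			break;			/* all mondos accepted */

		if (stat != H_EWOULDBLOCK && stat != H_ECPUERROR) {
			/* For all other errors, panic. */
			cmn_err(CE_PANIC, "unexpected hypervisor error 0x%lx "
			    "while sending a mondo", stat);
		}

		/*
		 * A large jump between the current tick count and lasttick
		 * usually means a breakpoint; push endtick out by the same
		 * amount so we do not panic spuriously.
		 */
		tick = gettick();
		if (tick > lasttick + xc_tick_jump_limit)
			endtick += tick - lasttick;
		lasttick = tick;

		if (tick > endtick) {
			cmn_err(CE_PANIC, "send mondo timeout "
			    "[retries: 0x%x]", retries);
		}
		retries++;
	}
}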
/*
 * Sends a cross-call to a specified processor.  The caller assumes
 * responsibility for repetition of cross-calls, as appropriate (MARSA for
 */

		return (KDI_XC_RES_ERR);

/* Not required on sun4v architecture. */

	/*
	 * For "mdb -K", set soft state to debugging
	 */

	/*
	 * check again as the read above may or may not have worked and if
	 * it didn't then soft state will still be -1
	 */

	/*
	 * For "mdb -K", set soft_state state back to original state on exit
	 */

/*
 * Routine to return memory information associated
 * with a physical address and syndrome.
 */

/*
 * This routine returns the size of the kernel's FRU name buffer.
 */

/*
 * This routine is a more generic interface to cpu_get_mem_unum()
 * that may be used by other modules (e.g. mm).
 */

/*
 * xt_sync - wait for previous x-traps to finish
 *
 * Sun4v uses a queue for receiving mondos.  Successful
 * transmission of a mondo only indicates that the mondo
 * has been written into the queue.
 *
 * We use an array of bytes to let each cpu signal back
 * to the cross trap sender that the cross trap has been
 * executed.  Set the byte to 1 before sending the cross trap
 * and wait until other cpus reset it to 0.
 *
 * To help debug xt_sync panics, each mondo is uniquely identified
 * by passing the tick value, traptrace_id, as the second mondo
 * argument to xt_some, which is logged in the CPU's mondo queue,
 * traptrace buffer and the panic message.
 */

	/*
	 * If there is a big jump between the current tick
	 * count and lasttick, we have probably hit a break
	 * point.  Adjust endtick accordingly to avoid panic.
	 */
		    "at cpu_sync.xword[%d]: 0x%lx "
		    "starttick: 0x%lx endtick: 0x%lx "
		    "traptrace_id = 0x%lx\n",
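/*
 * Hedged sketch of the xt_sync() acknowledgment scheme described above.  A
 * per-CPU byte is set before the cross trap is sent and cleared by the target
 * from its handler; the sender spins until every byte drops to zero or a
 * tick-based timeout expires.  cpu_sync_ack[], xt_sync_ack_sketch(), and the
 * timeout arithmetic are illustrative; the real code packs the bytes into
 * xwords so it can test eight CPUs per load (hence "cpu_sync.xword[%d]" in
 * the panic message above).
 */
static volatile uint8_t cpu_sync_ack[NCPU];

/* target-side handler: acknowledge by clearing this CPU's byte */
static void
xt_sync_ack_sketch(uint64_t arg1, uint64_t traptrace_id)
{
	cpu_sync_ack[CPU->cpu_id] = 0;
}

static void
xt_sync_sketch(cpuset_t cpuset)
{
	uint64_t starttick, lasttick, endtick, tick, traptrace_id;
	int i, pending;

	CPUSET_DEL(cpuset, CPU->cpu_id);	/* never wait on ourselves */

	starttick = lasttick = gettick();
	endtick = starttick + xc_sync_tick_limit;
	traptrace_id = starttick;	/* tick value tags this mondo */

	for (i = 0; i < NCPU; i++)
		if (CPU_IN_SET(cpuset, i))
			cpu_sync_ack[i] = 1;

	/* pass the traptrace id as the second mondo argument */
	xt_some(cpuset, (xcfunc_t *)xt_sync_ack_sketch, 0, traptrace_id);

	do {
		pending = 0;
		for (i = 0; i < NCPU; i++)
			if (cpu_sync_ack[i] != 0)
				pending = 1;

		/* a large tick jump usually means a debugger stop */
		tick = gettick();
		if (tick > lasttick + xc_tick_jump_limit)
			endtick += tick - lasttick;
		lasttick = tick;

		if (tick > endtick) {
			cmn_err(CE_PANIC, "xt_sync: timeout "
			    "starttick: 0x%lx endtick: 0x%lx "
			    "traptrace_id = 0x%lx",
			    starttick, endtick, traptrace_id);
		}
	} while (pending);
}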
/*
 * Recalculate the values of the cross-call timeout variables based
 * on the value of the 'inter-cpu-latency' property of the platform node.
 * The property sets the number of nanosec to wait for a cross-call
 * to be acknowledged.  Other timeout variables are derived from it.
 *
 * N.B. This implementation is aware of the internals of xc_init()
 * and updates many of the same variables.
 */

	/* See x_call.c for descriptions of these extern variables. */

	/* Temp versions of the target variables */

	/*
	 * Look up the 'inter-cpu-latency' (optional) property in the
	 * platform node of the MD.  The units are nanoseconds.
	 */
		    "Unable to initialize machine description");

	    "inter-cpu-latency", &latency) == -1)
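/*
 * Hedged sketch of the MD lookup described above, using the sun4v machine
 * description interfaces (md_get_handle(), md_alloc_scan_dag(),
 * md_get_prop_val()).  Error handling is abbreviated and the surrounding
 * timeout recalculation is omitted; the helper name is illustrative.
 */
static int
lookup_inter_cpu_latency_sketch(uint64_t *latencyp)
{
	md_t		*mdp;
	mde_cookie_t	*platlist = NULL;
	int		nrnode, found = 0;

	if ((mdp = md_get_handle()) == NULL) {
		cmn_err(CE_WARN,
		    "Unable to initialize machine description");
		return (0);
	}

	/* find the 'platform' node under the MD root */
	nrnode = md_alloc_scan_dag(mdp, md_root_node(mdp),
	    "platform", "fwd", &platlist);
	if (nrnode <= 0)
		goto done;

	/* optional property: absent simply means "keep the defaults" */
	found = (md_get_prop_val(mdp, platlist[0],
	    "inter-cpu-latency", latencyp) == 0);

done:
	if (platlist != NULL)
		md_free_scan_dag(mdp, &platlist);
	(void) md_fini_handle(mdp);
	return (found);
}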
	/*
	 * clock.h defines an assembly-language macro
	 * (NATIVE_TIME_TO_NSEC_SCALE) to convert from %stick
	 * units to nanoseconds.  Since the inter-cpu-latency
	 * units are nanoseconds and the xc_* variables require
	 * %stick units, we need the inverse of that function.
	 * The trick is to perform the calculation without
	 * floating point, but also without integer truncation
	 * or overflow.  To understand the calculation below,
	 * please read the discussion of the macro in clock.h.
	 * Since this new code will be invoked infrequently,
	 * we can afford to implement it in C.
	 *
	 * tick_scale is the reciprocal of nsec_scale which is
	 * calculated at startup in setcpudelay().  The calc
	 * of tick_limit parallels that of NATIVE_TIME_TO_NSEC_SCALE
	 * except we use tick_scale instead of nsec_scale and
	 * C instead of assembler.
	 */

	/*
	 * xc_init() calculated 'maxfreq' by looking at all the cpus,
	 * and used it to derive some of the timeout variables that we
	 * recalculate below.  We can back into the original value by
	 * using the inverse of one of those calculations.
	 */

	/*
	 * Don't allow the new timeout (xc_tick_limit) to fall below
	 * the system tick frequency (stick).  Allowing the timeout
	 * to be set more tightly than this empirically determined
	 * value may cause panics.
	 */

	/*
	 * Recalculate xc_scale since it is used in a callback function
	 * (xc_func_timeout_adj) to adjust two of the timeouts dynamically.
	 * Make the change in xc_scale proportional to the change in
	 * the timeout values.
	 */

	/*
	 * Don't modify the timeouts if nothing has changed.  Else,
	 * stuff the variables with the freshly calculated (temp)
	 * variables.  This minimizes the window where the set of
	 * values could be inconsistent.
	 */

	/*
	 * Increase the timeout limit for xt_sync() cross calls.
	 */

	/*
	 * Force the new values to be used for future cross calls.
	 * This is necessary only when we increase the timeouts.
	 */

/*
 * Try to register soft_state api.  If it fails, soft_state api has not
 * been implemented in the firmware, so do not bother to setup
 * soft_state in the kernel.
 */

	/*
	 * Tell OBP that we are supporting Guest State
	 */

		    "hv_soft_state_set returned %ld\n", rc);

		    "hv_soft_state_get returned %ld\n", rc);
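/*
 * Hedged sketch of the inverse conversion discussed above: turning the
 * 'inter-cpu-latency' value (nanoseconds) into %stick ticks with fixed-point
 * arithmetic, i.e. the reverse of clock.h's NATIVE_TIME_TO_NSEC_SCALE, with
 * no floating point and no 64-bit overflow.  TICK_SHIFT, the helper name and
 * sys_tick_freq (the %stick frequency in Hz) are stand-ins here; the caller
 * would still clamp the result as the comments above require.
 */
#define	TICK_SHIFT	22	/* keeps tick_scale well inside 32 bits */

static uint64_t
nsec_to_ticks_sketch(uint64_t nsec, uint64_t sys_tick_freq)
{
	const uint64_t mask = (1ULL << TICK_SHIFT) - 1;
	uint64_t tick_scale;

	/* "ticks per nanosecond", scaled up by 2^TICK_SHIFT */
	tick_scale = (sys_tick_freq << TICK_SHIFT) / NANOSEC;

	/*
	 * (nsec * tick_scale) >> TICK_SHIFT, computed in two halves so the
	 * intermediate products cannot overflow 64 bits; this mirrors the
	 * structure of the forward macro in clock.h.
	 */
	return ((nsec >> TICK_SHIFT) * tick_scale +
	    (((nsec & mask) * tick_scale) >> TICK_SHIFT));
}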