/*
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License (the "License").
 * You may not use this file except in compliance with the License.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each file.
 * If applicable, add the following below this CDDL HEADER, with the
 * fields enclosed by brackets "[]" replaced with your own identifying
 * information: Portions Copyright [yyyy] [name of copyright owner]
 */
/*
 * Copyright (c) 2004, 2010, Oracle and/or its affiliates. All rights reserved.
 * Copyright (c) 2011 by Delphix. All rights reserved.
 * Copyright 2013 Nexenta Systems, Inc. All rights reserved.
 * Copyright 2014 Josef "Jeff" Sipek <jeffpc@josefsipek.net>
 */
/*
 * Copyright (c) 2010, Intel Corporation.
 * All rights reserved.
 */
/*
 * Portions Copyright 2009 Advanced Micro Devices, Inc.
 */
/*
 * Copyright (c) 2015, Joyent, Inc. All rights reserved.
 */
/*
 * Various routines to handle identification
 * and classification of x86 processors.
 */
/*
 * Pass 0 of cpuid feature analysis happens in locore. It contains special code
 * to recognize Cyrix processors that are not cpuid-compliant, and to deal with
 * them accordingly. For most modern processors, feature detection occurs here
 * in pass 1.
 *
 * Pass 1 of cpuid feature analysis happens just at the beginning of mlsetup()
 * for the boot CPU and does the basic analysis that the early kernel needs.
 * x86_featureset is set based on the return value of cpuid_pass1() of the boot
 * CPU, and x86_vendor is set accordingly. This pass includes:
 *
 * o Processing the feature flags returned by the cpuid instruction while
 *   applying any workarounds or tricks for the specific processor.
 *
 * o Mapping the feature flags into Solaris feature bits (X86_*).
 *
 * o Processing extended feature flags if supported by the processor,
 *   again while applying specific processor knowledge.
 */
/*
 * o Determining the CMT characteristics of the system.
 *
 * Pass 1 is done on non-boot CPUs during their initialization and the results
 * are used only as a meager attempt at ensuring that all processors within the
 * system support the same features.
 *
 * Pass 2 of cpuid feature analysis happens just at the beginning
 * of startup(). It just copies in and corrects the remainder
 * of the cpuid data we depend on: standard cpuid functions that we didn't
 * need for pass1 feature analysis, and extended cpuid functions beyond the
 * simple feature processing done in pass1.
 *
 * Pass 3 of cpuid analysis is invoked after basic kernel services; in
 * particular kernel memory allocation has been made available. It creates a
 * readable brand string based on the data collected in the first two passes.
 *
 * Pass 4 of cpuid analysis is invoked after post_startup() when all
 * the support infrastructure for various hardware features has been
 * initialized. It determines which processor features will be reported
 * to userland via the aux vector.
 *
 * All passes are executed on all CPUs, but only the boot CPU determines what
 * features the kernel will use.
 *
 * Much of the worst junk in this file is for the support of processors
 * that didn't really implement the cpuid instruction properly.
 *
 * NOTE: The accessor functions (cpuid_get*) are aware of, and ASSERT upon,
 * the pass numbers. Accordingly, changes to the pass code may require changes
 * to the accessor code.
 */
/*
 * We assume that the unused bits of the bitmap are always zero.
 */
/*
 * This is set to the platform type we are running on.
 */
/*
 * Variable to patch if hypervisor platform detection needs to be
 * disabled (e.g. platform_type will always be HW_NATIVE if this is 0).
 */
/*
 * size_actual and buf_actual are the real address and size allocated to get
 * proper mwait_buf alignment. buf_actual and size_actual should be passed
 * to kmem_free(). Currently kmem_alloc() and mwait happen to both use
 * processor cache-line alignment, but this is not guaranteed in the future.
 */
/*
 * This structure contains HW feature bits and the size of the xsave save area.
 * Note: the kernel will use the maximum size required for all hardware
 * features. It is not optimized for potential memory savings if features at
 * the end of the save area are not enabled.
 */
/*
 * These constants determine how many of the elements of the
 * cpuid we cache in the cpuid_info data structure; the
 * remaining elements are accessible via the cpuid instruction.
 */
/*
 * Some terminology needs to be explained:
 *  - Socket: Something that can be plugged into a motherboard.
 *  - Package: Same as socket.
 *  - Chip: Same as socket. Note that AMD's documentation uses the term "chip"
 *    differently: there, chip is the same as processor node (below).
 *  - Processor node: Some AMD processors have more than one
 *    "subprocessor" embedded in a package. These subprocessors (nodes)
 *    are fully-functional processors themselves with cores, caches,
 *    memory controllers, PCI configuration spaces. They are connected
 *    inside the package with Hypertransport links. On single-node
 *    processors, a node is equivalent to a chip/socket/package.
 *  - Compute Unit: Some AMD processors pair cores in "compute units" that
 *    share the FPU and the I$ and L2 caches.
 */
/*
 * standard function information
 */
/*
 * extended function information
 */
/* Intel: fn 4: %eax[31-26] */
/*
 * supported feature information
 */
/*
 * Synthesized information, where known.
 */
/*
 * These bit fields are defined by the Intel Application Note AP-485
 * "Intel Processor Identification and the CPUID Instruction"
 */
/*
 * Function 4 (Deterministic Cache Parameters) macros
 * Defined by Intel Application Note AP-485
 */
/*
 * A couple of shorthand macros to identify "later" P6-family chips
 * like the Pentium M and Core.
 */
/*
 * First, the "older" P6-based stuff
 * (loosely defined as "pre-Pentium-4"):
 * P6, PII, Mobile PII, PII Xeon, PIII, Mobile PIII, PIII Xeon
 */
/* A "new F6" is everything with family 6 that's not the above */
/*
 * See the cpuid section of "Intel 64 and IA-32 Architectures Software
 * Developer's Manual Volume 2A: Instruction Set Reference, A-M"
 * #25366-022US, November 2006.
 */
/*
 * See the MONITOR/MWAIT section of "AMD64 Architecture Programmer's Manual
 * Documentation Updates" #33633, Rev 2.05, December 2006.
 */
/*
 * Number of sub-cstates for a given c-state.
 */
/*
 * XSAVE leaf 0xD enumeration
 */
/*
 * Functions we consume from cpuid_subr.c; don't publish these in a header
 * file to try and keep people using the expected cpuid_* interfaces.
 */
/*
 * Apply various platform-dependent restrictions where the
 * underlying platform restrictions mean the CPU can be marked
 * as less capable than its cpuid instruction would imply.
 */
/* Zero out the (ncores-per-chip - 1) field */
/* Zero out the (ncores-per-chip - 1) field */
/*
 * Some undocumented ways of patching the results of the cpuid
 * instruction to permit running Solaris 10 on future cpus that
 * we don't currently support. Could be set to non-zero values
 * via settings in eeprom.
 */
/*
 * Allocate space for mcpu_cpi in the machcpu structure for all non-boot CPUs.
 * By convention, cpu0 is the boot cpu, which is set up
 * before memory allocation is available. All other cpus get
 * their cpuid_info struct allocated here.
 */
/*
 * Free up any function 4 related dynamic storage
 */
/*
 * Determine the type of the underlying platform. This is used to customize
 * initialization of various subsystems (e.g. TSC). determine_platform() must
 * only ever be called once to prevent two processors from seeing different
 * values of platform_type. Must be called before cpuid_pass1(), the earliest
 * consumer to execute (uses _cpuid_chiprev --> synth_amd_info --> get_hwenv).
 */
/*
 * If the Hypervisor CPUID bit is set, try to determine the hypervisor
 * vendor signature, and set the platform type accordingly.
 */
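The vendor-signature check described above can be sketched as follows. This is an illustrative helper, not the kernel's actual code: the function and variable names are invented here, and the signature strings are the well-known 12-byte values hypervisors return in %ebx/%ecx/%edx of the pseudo-cpuid leaf at base 0x40000000.

```c
#include <stdint.h>
#include <string.h>

/*
 * Sketch: reassemble the 12-byte hypervisor vendor signature that a
 * pseudo-cpuid leaf (base 0x40000000) returns in %ebx, %ecx and %edx,
 * and classify it. Illustrative only; not taken from the kernel.
 */
static const char *
hv_classify(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
	char sig[13];

	(void) memcpy(sig, &ebx, 4);
	(void) memcpy(sig + 4, &ecx, 4);
	(void) memcpy(sig + 8, &edx, 4);
	sig[12] = '\0';

	if (strcmp(sig, "VMwareVMware") == 0)
		return ("vmware");
	if (strcmp(sig, "XenVMMXenVMM") == 0)
		return ("xen");
	if (strcmp(sig, "KVMKVMKVM") == 0)
		return ("kvm");
	if (strcmp(sig, "Microsoft Hv") == 0)
		return ("hyperv");
	return ("unknown");
}
```

In the real code the registers come from executing cpuid at each candidate base leaf, trying several bases as the Xen comment below explains.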
/*
 * Check older VMware hardware versions. VMware hypervisor is
 * detected by performing an IN operation to the VMware hypervisor
 * port and checking that the value returned in %ebx is the VMware
 * hypervisor magic value.
 */
/*
 * Check Xen hypervisor. In a fully virtualized domain,
 * Xen's pseudo-cpuid function returns a string representing the
 * Xen signature in %ebx, %ecx, and %edx. %eax contains the maximum
 * supported cpuid function. We need at least a (base + 2) leaf value
 * to do what we want to do. Try different base values, since the
 * hypervisor might use a different one depending on whether Hyper-V
 * emulation is switched on by default or not.
 */
/*
 * Multi-core (and possibly multi-threaded)
 * 8-bit APIC IDs on dual core Pentiums
 *
 * +-----------------------+------+------+
 * | Physical Package ID   |  MC  |  HT  |
 * +-----------------------+------+------+
 * <------- chipid -------->
 * <------- coreid --------------->
 *
 * Where the number of bits necessary to
 * represent MC and HT fields together equals
 * the minimum number of bits necessary to
 * store the value of cpi->cpi_ncpu_per_chip.
 * Of those bits, the MC part uses the number
 * of bits necessary to store the value of
 * cpi->cpi_ncore_per_chip.
 */
/*
 * Single-core multi-threaded processors.
 */
/*
 * AMD CMP chips currently have a single thread per core.
 *
 * Since no two cpus share a core we must assign a distinct coreid
 * per cpu, and we do this by using the cpu_id. This scheme does not,
 * however, guarantee that sibling cores of a chip will have sequential
 * coreids starting at a multiple of the number of cores per chip -
 * that is usually the case, but if the ACPI MADT table is presented
 * in a different order then we need to perform a few more gymnastics.
 */
/*
 * All processors in the system have the same number of enabled
 * cores.
 */
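The chipid/coreid bit-slicing diagrammed above can be sketched as follows. This is a hedged illustration: `bits_for` and the function names are invented here, and the per-chip counts are assumed to be powers of two, as the diagram implies.

```c
#include <stdint.h>

/*
 * Number of bits needed to represent values 0 .. n-1,
 * assuming n is a power of two (as the diagram implies).
 */
static int
bits_for(uint32_t n)
{
	int b = 0;

	while ((1u << b) < n)
		b++;
	return (b);
}

/* chipid: shift off all of the MC and HT bits */
static uint32_t
apic_to_chipid(uint32_t apicid, uint32_t ncpu_per_chip)
{
	return (apicid >> bits_for(ncpu_per_chip));
}

/* coreid: shift off only the HT bits, keeping package and MC fields */
static uint32_t
apic_to_coreid(uint32_t apicid, uint32_t ncpu_per_chip,
    uint32_t ncore_per_chip)
{
	int ht_bits = bits_for(ncpu_per_chip) - bits_for(ncore_per_chip);

	return (apicid >> ht_bits);
}
```

For example, with 4 logical CPUs and 2 cores per chip, APIC ID 13 (0b1101) yields chipid 3 and coreid 6 under this decomposition.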
/*
 * Cores within a processor are always numbered sequentially
 * from 0 regardless of how many or which are disabled, and there
 * is no way for the operating system to discover the real core
 * id when some are disabled.
 *
 * In family 0x15, the cores come in pairs called compute units. They
 * share the I$ and L2 caches and the FPU. Enumeration of this feature is
 * simplified by the new topology extensions CPUID leaf, indicated by
 * the X86 feature X86FSET_TOPOEXT.
 */
/*
 * In AMD parlance chip is really a node while Solaris
 * uses chip to mean the whole socket/package (see the
 * terminology notes above).
 */
/* Assume single-core part */
/* Get node ID, compute unit ID */
/*
 * See if we are a multi-node processor.
 * All processors in the system have the same number of nodes.
 */
/*
 * Multi-node revision D (2 nodes per package)
 */
/* NodeId[2:1] bits to use for reading F3xe8 */
/*
 * Check IntNodeNum bit (31:30, but bit 31 is
 * always 0 on dual-node processors)
 */
/*
 * Setup XFeature_Enabled_Mask register. Required by xsave feature.
 */
/*
 * Space statically allocated for BSP, ensure pointer is set
 */
/*
 * Limit the range in case of weird hardware
 */
/*
 * Extract identifying constants for easy access.
 */
/*
 * Beware: AMD uses "extended model" iff base *FAMILY* == 0xf.
 * Intel, and presumably everyone else, uses model == 0xf, as
 * one would expect (max value means possible overflow). Sigh.
 */
/*
 * - believe %edx feature word
 * - ignore %ecx feature word
 * - 32-bit virtual and physical addressing
 */
/*
 * Clear the SEP bit when it was set erroneously
 */
/*
 * We don't currently depend on any of the %ecx
 * features until Prescott, so we'll only check
 * this from P4 onwards. We might want to revisit
 * this.
 */
/*
 * to obtain the monitor linesize.
 */
/*
 * These CPUs have an incomplete implementation
 * of MCA/MCE which we mask away.
 */
/*
 * Model 0 uses the wrong (APIC) bit
 * to indicate PGE. Fix it here.
 */
/*
 * Early models had problems w/ MMX; disable.
 */
/*
 * For newer families, SSE3 and CX16, at least, are valid;
 */
/*
 * to obtain the monitor linesize.
 */
/*
 * Do not use MONITOR/MWAIT to halt in the idle loop on any AMD
 * processors.
 */
/*
 * AMD does not intend MWAIT to be used in the cpu
 * idle loop on current and future processors. 10h and future
 * AMD processors use more power in MWAIT than HLT.
 * Pre-family-10h Opterons do not have the MWAIT instruction.
 */
/*
 * workaround the NT workaround in CMS 4.1
 */
/*
 * workaround the NT workarounds again
 */
/*
 * We rely heavily on the probing in locore
 * to actually figure out what parts, if any,
 * of the Cyrix cpuid instruction to believe.
 */
/*
 * Do not support XSAVE under a hypervisor for now
 */
/*
 * Now we've figured out the masks that determine
 * which bits we choose to believe, apply the masks
 * to the feature words, then map the kernel's view
 * of these feature words into its feature word.
 */
/*
 * apply any platform restrictions (we don't call this
 * immediately after __cpuid_insn here, because we need the
 * workarounds applied above first)
 */
/*
 * In addition to ecx and edx, Intel is storing a bunch of instruction
 * set extensions in leaf 7's ebx.
 */
/*
 * If XSAVE has been disabled, just ignore all of the AVX
 * feature bits.
 */
/*
 * We check disable_smap here in addition to in startup_smap()
 * to ensure CPUs that aren't the boot CPU don't accidentally
 * include it in the feature set and thus generate a mismatched
 * x86 feature set across CPUs. Note that at this time we only
 * enable SMAP for the 64-bit kernel.
 */
/*
 * fold in overrides from the "eeprom" mechanism
 */
/*
 * are prerequisites before we'll even
 */
/* We only test AVX when there is XSAVE */
/*
 * Intel says we can't check these without also
 */
/*
 * We require the CLFLUSH instruction for erratum workaround
 */
/*
 * All processors we are aware of which have
 */
/*
 * Only need it the first time; the rest of the cpus would follow suit.
 * we only capture this for the bootcpu.
 */
/*
 * Hyperthreading configuration is slightly tricky on Intel
 * and pure clones, and even trickier on AMD.
 *
 * (AMD chose to set the HTT bit on their CMP processors,
 * even though they're not actually hyperthreaded. Thus it
 * takes a bit more work to figure out what's really going
 * on ...
 */
/*
 * see the handling of the CMP_LGCY bit below)
 */
/*
 * Work on the "extended" feature information, doing
 * some basic initialization for cpuid_pass2()
 */
/*
 * Only these Cyrix CPUs are -known- to support
 * extended cpuid operations.
 */
/*
 * K6 model 6 uses bit 10 to indicate SYSC
 * Later models use bit 11. Fix it here.
 */
/*
 * Compute the additions to the kernel's feature word.
 */
/*
 * Regardless whether or not we boot 64-bit,
 * we should have a way to identify whether
 * the CPU is capable of running 64-bit.
 */
/* 1 GB large page - enable only for 64 bit kernel */
/*
 * If both the HTT and CMP_LGCY bits are set,
 * then we're not actually HyperThreaded. Read
 * "AMD CPUID Specification" for more details.
 */
/*
 * instead. In the amd64 kernel, things are -way-
 */
/*
 * While we're thinking about system calls, note
 * that AMD processors don't support sysenter
 * in long mode at all, so don't try to program them.
 */
/*
 * Get CPUID data about processor cores and hyperthreads.
 */
/*
 * Virtual and physical address limits from
 * cpuid override previously guessed values.
 */
/*
 * Derive the number of cores per chip
 */
/*
 * On family 0xf cpuid fn 0x80000008 ECX[7:0] "NC" is
 * 1 less than the number of physical cores on
 * the chip. In family 0x10 this value can
 * be affected by "downcoring" - it reflects
 * 1 less than the number of cores actually
 * present.
 */
/*
 * Get CPUID data about TSC Invariance in Deep C-State.
 */
/*
 * If more than one core, then this processor is CMP.
 */
/*
 * If the number of cores is the same as the number
 * of CPUs, then we cannot have HyperThreading.
 */
/*
 * Single-core single-threaded processors.
 */
/*
 * All other processors are currently
 * assumed to have single cores.
 */
/*
 * Synthesize chip "revision" and socket type
 */
/*
 * Make copies of the cpuid table entries we depend on, in
 * part for ease of parsing now, in part so that we have only
 * one place to correct any of it, in part for ease of
 * later export to userland, and in part so we can look at
 * this stuff in a crash dump.
 */
/*
 * (We already handled n == 0 and n == 1 in pass 1)
 */
/*
 * CPUID function 4 expects %ecx to be initialized
 * with an index which indicates which cache to return
 * information about. The OS is expected to call function 4
 * with %ecx set to 0, 1, 2, ... until it returns with
 * EAX[4:0] set to 0, which indicates there are no more
 * caches.
 *
 * Here, populate cpi_std[4] with the information returned by
 * function 4 when %ecx == 0, and do the rest in cpuid_pass3()
 * when dynamic memory allocation becomes available.
 *
 * Note: we need to explicitly initialize %ecx here, since
 * function 4 may have been previously invoked.
 */
/*
 * "the lower 8 bits of the %eax register
 * contain a value that identifies the number
 * of times the cpuid [instruction] has to be
 * executed to obtain a complete image of the
 * processor's caching systems."
 *
 * How *do* they make this stuff up?
 */
/*
 * Well, for now, rather than attempt to implement
 * this slightly dubious algorithm, we just look
 */
case 3:
/* Processor serial number, if PSN supported */
case 4:
/* Deterministic cache parameters */
/*
 * Protect ourselves from insane mwait line size.
 */
/*
 * Workaround for incomplete hardware emulator(s).
 */
/*
 * Check CPUID.EAX=0BH, ECX=0H:EBX is non-zero, which
 * indicates that the extended topology enumeration leaf is
 * available.
 */
/* Make cp NULL so that we don't stumble on others */
/*
 * Sanity checks for debug
 */
/*
 * If the hw supports AVX, get the size and offset in the save
 * area for the ymm state.
 */
/* Broken CPUID 0xD, probably in HVM */
"value: hw_low = %d, hw_high = %d, xsave_size = %d"
", ymm_size = %d, ymm_offset = %d\n",
/*
 * This must be a non-boot CPU. We cannot
 * continue, because the boot cpu has already
 * enabled XSAVE.
 */
"enabled XSAVE on boot cpu, cannot "
/*
 * If we reached here on the boot CPU, it's also
 * almost certain that we'll reach here on the
 * non-boot CPUs. When we're here on a boot CPU
 * we should disable the feature, on a non-boot
 * CPU we need to confirm that we have.
 */
/*
 * Copy the extended properties, fixing them as we go.
 * (We already handled n == 0 and n == 1 in pass 1)
 */
/*
 * Extract the brand string
 */
/*
 * The Athlon and Duron were the first
 * parts to report the sizes of the
 * TLB for large pages. Before then,
 * we don't trust the data.
 */
/*
 * The Athlon and Duron were the first
 * AMD parts with L2 TLB's.
 * Before then, don't trust the data.
 */
/*
 * AMD Duron rev A0 reports L2
 * cache size incorrectly as 1K
 */
/*
 * VIA C3 processors are a bit messed
 * up w.r.t. encoding cache sizes in %ecx
 */
/*
 * model 7 and 8 were incorrectly encoded
 *
 * xxx is model 8 really broken?
 */
/*
 * model 9 stepping 1 has wrong associativity
 */
/*
 * Extended L2 Cache features function.
 * First appeared on Prescott.
 */
return (
"Intel Pentium(r)");
return ("Intel Pentium(r) Pro");
return ("Intel Pentium(r) II");
return ("Intel Celeron(r)");
for (i = 1; i < 4; i++) {
if (tmp >= 0x44 && tmp <= 0x45)
for (i = 0; i < 2; i++) {
else if (tmp >= 0x44 && tmp <= 0x45)
for (i = 0; i < 4; i++) {
else if (tmp >= 0x44 && tmp <= 0x45)
for (i = 0; i < 4; i++) {
else if (tmp >= 0x44 && tmp <= 0x45)
return ("Intel Celeron(r)");
"Intel Pentium(r) II Xeon(tm)" :
"Intel Pentium(r) III Xeon(tm)");
"Intel Pentium(r) II or Pentium(r) II Xeon(tm)" :
"Intel Pentium(r) III or Pentium(r) III Xeon(tm)");
/* BrandID is present if the field is nonzero */ {
0x1,	"Intel(r) Celeron(r)" },
{ 0x2,	"Intel(r) Pentium(r) III" },
{ 0x3,	"Intel(r) Pentium(r) III Xeon(tm)" },
{ 0x4,	"Intel(r) Pentium(r) III" },
{ 0x6,	"Mobile Intel(r) Pentium(r) III" },
{ 0x7,	"Mobile Intel(r) Celeron(r)" },
{ 0x8,	"Intel(r) Pentium(r) 4" },
{ 0x9,	"Intel(r) Pentium(r) 4" },
{ 0xa,	"Intel(r) Celeron(r)" },
{ 0xb,	"Intel(r) Xeon(tm)" },
{ 0xc,	"Intel(r) Xeon(tm) MP" },
{ 0xe,	"Mobile Intel(r) Pentium(r) 4" },
{ 0xf,	"Mobile Intel(r) Celeron(r)" },
{ 0x11,	"Mobile Genuine Intel(r)" },
{ 0x12,	"Intel(r) Celeron(r) M" },
{ 0x13,	"Mobile Intel(r) Celeron(r)" },
{ 0x14,	"Intel(r) Celeron(r)" },
{ 0x15,	"Mobile Genuine Intel(r)" },
{ 0x16,	"Intel(r) Pentium(r) M" },
{ 0x17,	"Mobile Intel(r) Celeron(r)" }
return ("Intel(r) Celeron(r)");
return ("Intel(r) Xeon(tm) MP");
return ("Intel(r) Xeon(tm)");
return ("i486 compatible");
return ("AMD-K6(r)-III");
return ("AMD (family 5)");
return ("AMD Athlon(tm)");
return ("AMD Duron(tm)");
/*
 * Use the L2 cache size to distinguish
 */
"AMD Athlon(tm)" :
"AMD Duron(tm)");
return ("AMD (family 6)");
return ("AMD Opteron(tm) UP 1xx");
return ("AMD Opteron(tm) DP 2xx");
return ("AMD Opteron(tm) MP 8xx");
return ("AMD Opteron(tm)");
return ("i486 compatible");
return ("Cyrix MediaGX");
/*
 * Have another wild guess ..
 */
return ("Cyrix 6x86");		/* Cyrix M1 */
return ("Cyrix MediaGX");
return ("Cyrix 6x86MX");	/* Cyrix M2? */
/*
 * This only gets called in the case that the CPU extended
 * feature brand string (0x80000002, 0x80000003, 0x80000004)
 * isn't available, or contains null bytes for some reason.
 */
brand = "Transmeta Crusoe TM3x00 or TM5x00";
/*
 * This routine is called just after kernel memory allocation
 * becomes available on cpu0, and as part of mp_startup() on
 * the other cpus.
 *
 * Fixup the brand string, and collect any information from cpuid
 * that requires dynamically allocated storage to represent.
 */
/*
 * Function 4: Deterministic cache parameters
 *
 * Take this opportunity to detect the number of threads
 * sharing the last level cache, and construct a corresponding
 * cache id. The respective cpuid_info members are initialized
 * to the default case of "no last level cache sharing".
 */
/*
 * Find the # of elements (size) returned by fn 4, and along
 * the way detect last level cache sharing details.
 */
/*
 * Allocate the cpi_std_4 array. The first element
 * references the regs for fn 4, %ecx == 0, which
 * cpuid_pass2() stashed in cpi->cpi_std[4].
 */
/*
 * Allocate storage to hold the additional regs
 * for function 4, %ecx == 1 .. cpi_std_4_size.
 *
 * The regs for fn 4, %ecx == 0 have already
 * been allocated as indicated above.
 */
for (i = 1; i < size; i++) {
/*
 * Determine the number of bits needed to represent
 * the number of CPUs sharing the last level cache.
 *
 * Shift off that number of bits from the APIC id to
 * derive the cache id.
 */
/*
 * Now fixup the brand string
 */
/*
 * If we successfully extracted a brand string from the cpuid
 * instruction, clean it up by removing leading spaces and
 * similar artifacts.
 */
/*
 * Remove any "Genuine" or "Authentic" prefixes
 */
/*
 * Now do an in-place copy.
 */
/*
 * Map (R) to (r) and (TM) to (tm).
 * The era of teletypes is long gone, and there's
 * -really- no need to shout.
 */
/*
 * Finally, remove any trailing spaces
 */
/*
 * This routine is called out of bind_hwcap() much later in the life
 * of the kernel (post_startup()). The job of this routine is to resolve
 * the hardware feature support and kernel support for those features into
 * what we're actually going to tell applications via the aux vector.
 */
/*
 * [these require explicit kernel support]
 */
/*
 * [no explicit support required beyond x87 fp context]
 */
/*
 * Now map the supported feature vector to things that we
 * think userland will care about.
 */
/*
 * Seems like Intel duplicated what was necessary
 * here to make the initial crop of 64-bit OS's work.
 * Hopefully, those are the only "extended" bits
 * we need to worry about.
 */
/*
 * [these features require explicit kernel support]
 */
/*
 * [no explicit support required beyond
 * x87 fp context and exception handlers]
 */
/*
 * Now map the supported feature vector to
 * things that we think userland will care about.
 */
/*
 * Intel uses a different bit in the same word.
 */
/*
 * Simulate the cpuid instruction using the data we previously
 * captured about this CPU. We try our best to return the truth
 * about the hardware, independently of kernel support.
 */
/*
 * CPUID data is cached in two separate places: cpi_std for standard
 * CPUID functions, and cpi_extd for extended CPUID functions.
 */
/*
 * The caller is asking for data from an input parameter which
 * the kernel has not cached. In this case we go fetch from
 * the hardware and return the data directly to the user.
 */
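The brand-string cleanup steps described above (strip leading spaces, fold (R)/(TM) to lowercase in place, trim trailing spaces) might be sketched like this; `brand_cleanup` is an invented name for illustration, not the kernel's routine, and the prefix-stripping step is omitted for brevity.

```c
#include <string.h>

/*
 * Illustrative in-place cleanup of a cpuid brand string:
 * skip leading spaces, map (R) -> (r) and (TM) -> (tm),
 * and drop trailing spaces. Hypothetical helper.
 */
static void
brand_cleanup(char *s)
{
	char *src = s, *dst = s;

	while (*src == ' ')			/* leading spaces */
		src++;
	while (*src != '\0') {
		if (strncmp(src, "(R)", 3) == 0) {
			(void) memcpy(dst, "(r)", 3);
			src += 3;
			dst += 3;
		} else if (strncmp(src, "(TM)", 4) == 0) {
			(void) memcpy(dst, "(tm)", 4);
			src += 4;
			dst += 4;
		} else {
			*dst++ = *src++;
		}
	}
	while (dst > s && dst[-1] == ' ')	/* trailing spaces */
		dst--;
	*dst = '\0';
}
```

The in-place copy works because the destination never runs ahead of the source, exactly as the comments above describe.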
* AMD and Intel both implement the 64-bit variant of the syscall * instruction (syscallq), so if there's -any- support for syscall, * cpuid currently says "yes, we support this". * However, Intel decided to -not- implement the 32-bit variant of the * syscall instruction, so we provide a predicate to allow our caller * to test that subtlety here. * XXPV Currently, 32-bit syscall instructions don't work via the hypervisor, * even in the case where the hardware would in fact support it. static const char fmt[] =
"x86 (%s %X family %d model %d step %d clock %d MHz)";
"x86 (chipid 0x%x %s %X family %d model %d step %d clock %d MHz)";
/* Assume that socket types are the same across the system */
/*
 * Returns the number of data TLB entries for a corresponding
 * pagesize. If it can't be computed, or isn't known, the
 * routine returns zero. If you ask about an architecturally
 * impossible pagesize, the routine will panic (so that the
 * hat implementor knows that things are inconsistent.)
 */
/*
 * All zero in the top 16 bits of the register
 * indicates a unified TLB. Size is in low 16 bits.
 */
panic(
"unknown L2 pagesize");
/*
 * No L2 TLB support for this size, try L1.
 */
panic(
"unknown L1 d-TLB pagesize");
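The "top 16 bits all zero means a unified TLB, size in the low 16 bits" encoding described above can be decoded as in this sketch. This is a hedged illustration: the helper name is invented, and the split-TLB field position assumes the data-TLB layout of AMD's extended function 0x80000006.

```c
#include <stdint.h>

/*
 * Decode a 32-bit L2 TLB descriptor register. If the top 16 bits
 * are all zero the TLB is unified and the entry count is in the
 * low 16 bits; otherwise assume a split TLB with the data-TLB
 * entry count in bits 27:16 (AMD fn 0x80000006 layout assumed).
 */
static uint32_t
l2_dtlb_nent(uint32_t reg)
{
	if ((reg & 0xffff0000) == 0)
		return (reg & 0xffff);		/* unified TLB */
	return ((reg >> 16) & 0xfff);		/* data TLB field */
}
```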
/*
 * Return 0 if the erratum is not present or not applicable, positive
 * if it is, and negative if the status of the erratum is unknown.
 *
 * See "Revision Guide for AMD Athlon(tm) 64 and AMD Opteron(tm)
 * Processors" #25759, Rev 3.57, August 2005
 */
/*
 * Bail out if this CPU isn't an AMD CPU, or if it's
 * a legacy (32-bit) AMD CPU.
 */
#
define JH_E1(eax) (eax == 0x20f10)
/* JH8_E0 had 0x20f30 */ case 51:
/* what does the asterisk mean? */
/*
 * Test for AdvPowerMgmtInfo.TscPStateInvariant
 * if this is a K8 family or newer processor
 */
return (((((
eax >> 12) & 0xff00) + (eax & 0xf00)) | (((eax >> 4) & 0xf) | ((eax >> 12) & 0xf0))) < 0xf40);
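Expressions like the one just above compare packed family/model/stepping values extracted from the fn-1 %eax word. A sketch of the extraction, applying the AMD rule noted earlier (extended family/model bits only matter when the base family is 0xf); the function and variable names are illustrative, not kernel identifiers.

```c
#include <stdint.h>

/*
 * Extract family/model/stepping from a cpuid fn-1 %eax value,
 * AMD-style: extended family/model bits apply only when the
 * base family is 0xf. Illustrative helper, not kernel code.
 */
static void
fms_decode(uint32_t eax, int *family, int *model, int *step)
{
	*step = eax & 0xf;
	*family = (eax >> 8) & 0xf;
	*model = (eax >> 4) & 0xf;
	if (*family == 0xf) {
		*family += (eax >> 20) & 0xff;		/* extended family */
		*model |= ((eax >> 16) & 0xf) << 4;	/* extended model */
	}
}
```

For the JH_E1 value 0x20f10 defined above, this yields family 0xf, model 0x21, stepping 0.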
/*
 * Check for processors (pre-Shanghai) that do not provide
 * optimal management of 1GB ptes in their TLB.
 */
/*
 * Determine if the specified erratum is present via OSVW
 * (OS Visible Workaround). Return 1 if erratum is present,
 * 0 if not present and -1 if indeterminate.
 */
/* confirm OSVW supported */
/* assert that osvw feature setting is consistent on all cpus */
case 298:
/* osvwid is 0 */
/* osvwid 0 is unknown */
/*
 * Check the OSVW STATUS MSR to determine the state:
 * 1 - BIOS has applied the workaround when the BIOS
 * workaround is available. (Or, for other errata, an
 * OS workaround is required.)
 * For a value of 1, the caller will confirm that the
 * erratum 298 workaround has indeed been applied by BIOS.
 *
 * A 1 may be set in cpus that have a HW fix
 * in a mixed cpu system. Regarding erratum 298:
 * In a multiprocessor platform, the workaround above
 * should be applied to all processors regardless of
 * silicon revision when an affected processor is
 * detected.
 */
static const char assoc_str[] =
"associativity";
static const char line_str[] = "line-size";
/*
 * ndi_prop_update_int() is used because it is desirable for
 * DDI_PROP_HW_DEF and DDI_PROP_DONTSLEEP to be set.
 */
/*
 * Standard cpuid level 2 gives a randomly ordered
 * selection of tags that index into a table that describes
 * cache and tlb properties.
 */
/*
 * maintain descending order!
 *
 * 40H - intel_cpuid_4_cache_info() disambiguates l2/l3 cache
 * f0H/f1H - Currently we do not interpret prefetch size by design
 */
{
0x70,	4, 0, 32,	"tlb-4K" },
{ 0x80,	4, 16, 16*1024,	"l1-cache" },
/*
 * Search a cache table for a matching entry
 */
/*
 * Populate cachetab entry with L2 or L3 cache-information using
 * cpuid function 4. This function is called from intel_walk_cacheinfo()
 * when descriptor 0x49 is encountered. It returns 0 if no such cache
 * exists.
 */
/*
 * Walk the cacheinfo descriptor, applying 'func' to every valid element
 * The walk is terminated if the walker returns non-zero.
 */
/*
 * For overloaded descriptor 0x49 we use cpuid function 4
 * if supported by the current processor, to create
 * the cacheinfo entry.
 */
/*
 * For overloaded descriptor 0xb1 we use the X86_PAE flag
 * to disambiguate the cache information.
 */
}
else if (*dp == 0xb1) {
/*
 * (Like the Intel one, except for Cyrix CPUs)
 */
/*
 * Search Cyrix-specific descriptor table first ..
 */
/*
 * .. else fall back to the Intel one
 */
/*
 * A cacheinfo walker that adds associativity, line-size, and size properties
 * to the devinfo node it is passed as an argument.
 */
static const char fully_assoc[] =
"fully-associative?";
/*
 * Extended functions 5 and 6 directly describe properties of
 * tlbs and various cache levels.
 */
case 0:
/* reserved; ignore */
/*
 * Most AMD parts have a sectored cache. Multiple cache lines are
 * associated with each tag. A sector consists of all cache lines
 * associated with a tag. For example, the AMD K6-III has a sector
 * size of 2 cache lines per tag.
 */
default:
/* reserved; ignore */
/*
 * 4M/2M L1 TLB configuration
 *
 * We report the size for 2M pages because AMD uses two
 * TLB entries for one 4M page.
 */
/*
 * 4K L1 TLB configuration
 *
 * Crusoe processors have 256 TLB entries, but
 * cpuid data format constrains them to only
 * reporting 255 of them.
 */
/*
 * Crusoe processors also have a unified TLB
 */
/*
 * data L1 cache configuration
 */
/*
 * code L1 cache configuration
 */
/* Check for a unified L2 TLB for large pages */
/* Check for a unified L2 TLB for 4K pages */
/*
 * There are two basic ways that the x86 world describes its cache
 * and tlb architecture - Intel's way and AMD's way.
 * Return which flavor of cache architecture we should use.
 */
/*
 * The K5 model 1 was the first part from AMD that reported
 * cache sizes via extended cpuid functions.
 */
/*
 * If they have extended CPU data for 0x80000005
 * then we assume they have AMD-format cache
 * and tlb information.
 *
 * If not, and the vendor happens to be Cyrix,
 * then try our Cyrix-specific handler.
 *
 * If we're not Cyrix, then assume we're using Intel's
 * table-driven format instead.
 */
/* cpu-mhz, and clock-frequency */
"clock-frequency", (
int)mul);
/*
 * family, model, and step
 */
/*
 * AMD K5 model 1 was the first part to support this
 */
/*
 * brand id first appeared on Pentium III Xeon model 8,
 * and Celeron model 8 processors and Opteron
 */
/* chunks, and apic-id */
/*
 * first available on Pentium IV and Opteron (K8)
 */
/*
 * Brand String first appeared in Intel Pentium IV, AMD K5
 * model 1, and Cyrix GXm. On earlier models we try and
 * simulate something similar .. so this string should always
 * say -something- about the processor, however lame.
 */
/*
 * Finally, cache and tlb information
 */
/*
 * A cacheinfo walker that fetches the size, line-size and associativity
 * of an L2 cache.
 */
return (0);
/* not an L2 -- keep walking */
return (1);
/* was an L2 -- terminate walk */
/*
 * AMD L2/L3 Cache and TLB Associativity Field Definition:
 *
 * Unlike the associativity for the L1 cache and tlb where the 8 bit
 * value is the associativity, the associativity for the L2 cache and
 * tlb is encoded in the following table. The 4 bit L2 value serves as
 * an index into the amd_afd[] array to determine the associativity.
 * -1 is undefined. 0 is fully associative.
 */
{-
1, 1, 2, -1, 4, -1, 8, -1, 16, -1, 32, 48, 64, 96, 128, 0};
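Using the table above, the 4-bit associativity field can be looked up as in this sketch; the field position ECX[15:12] is an assumption based on the L2 layout of AMD's extended function 0x80000006, and the helper name is invented.

```c
#include <stdint.h>

/*
 * AMD extended L2/L3 associativity decoding, using the amd_afd[]
 * table defined above: -1 is undefined, 0 is fully associative.
 */
static const int amd_afd[] = {
	-1, 1, 2, -1, 4, -1, 8, -1, 16, -1, 32, 48, 64, 96, 128, 0
};

static int
l2_assoc_ways(uint32_t ecx)
{
	return (amd_afd[(ecx >> 12) & 0xf]);	/* ECX[15:12], assumed */
}
```

For example, a field value of 6 decodes to an 8-way cache, 0xf to fully associative, and 0 to "undefined".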
/*
 * kmem_alloc() returns cache line size aligned data for mwait_size
 * allocations. mwait_size is currently cache line sized. Neither
 * of these implementation details is guaranteed to be true in the
 * future.
 *
 * First try allocating mwait_size as kmem_alloc() currently returns
 * correctly aligned memory. If kmem_alloc() does not return
 * mwait_size aligned memory, then use mwait_size ROUNDUP.
 *
 * Set cpi_mwait.buf_actual and cpi_mwait.size_actual in case we
 * decide to free this memory.
 */
/*
 * Does the TSC run at a constant rate in all ACPI C-states?
 */
/*
 * Some AMD processors support C1E state. Entering this state will
 * cause the local APIC timer to stop, which we can't deal with at
 * this time.
 */
/* Disable C1E state if it is enabled by BIOS */
/*
 * Setup necessary registers to enable XSAVE feature on this processor.
 * This function needs to be called early enough, so that no xsave/xrstor
 * ops will execute on the processor before the MSRs are properly set up.
 *
 * Current implementation has the following assumptions:
 * - cpuid_pass1() is done, so that X86 features are known.
 * - fpu_probe() is done, so that fp_save_mech is chosen.
 */
/* Enable OSXSAVE in CR4. */
/*
 * Starting with the Westmere processor the local
 * APIC timer will continue running in all C-states,
 * including the deepest C-states.
 *
 * Always-running Local APIC Timer is
 * indicated by CPUID.6.EAX[2].
 */
/*
 * Check support for Intel ENERGY_PERF_BIAS feature.
 * The Intel ENERGY_PERF_BIAS MSR is indicated by
 * capability bit CPUID.6.ECX.3
 */
/*
 * Check support for TSC deadline timer
 *
 * TSC deadline timer provides a superior software programming
 * model over local APIC timer that eliminates "time drifts".
 * Instead of specifying a relative time, software specifies an
 * absolute time as the target at which the processor should
 * generate a timer event.
 */
/*
 * Patch in versions of bcopy for high performance Intel Nhm processors.
 */
for (i = 0; i <
cnt; i++) {
#endif	/* __amd64 && !__xpv */
/*
 * This function finds the number of bits to represent the number of cores per
 * chip and the number of strands per core for the Intel platforms.
 * It re-uses the x2APIC cpuid code of cpuid_pass2().
 */
/* if the cpuid level is 0xB, extended topo is available. */
/*
 * Check CPUID.EAX=0BH, ECX=0H:EBX is non-zero, which
 * indicates that the extended topology enumeration leaf is
 * available.
 */
/*
 * Thread level processor topology
 * Number of bits to shift right the APIC ID
 */
/*
 * Core level processor topology
 * Number of bits to shift right the APIC ID
 */