os_linux.cpp revision 548
 * Copyright 1999-2008 Sun Microsystems, Inc. All Rights Reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
 * CA 95054 USA or visit www.sun.com if you need additional information or
 * have any questions.
// do not include precompiled header file
// put OS-includes here
// for timer info max values which include all bits
////////////////////////////////////////////////////////////////////////////////
// For diagnostics to print a message once. see run_periodic_checks
/* do not use any signal number less than SIGSEGV, see 4355769 */
/* Used to protect dlsym() calls */
////////////////////////////////////////////////////////////////////////////////
  // values in struct sysinfo are "unsigned long"
  // See comments under solaris for alignment considerations
////////////////////////////////////////////////////////////////////////////////
// environment support
  if (len > 0) buf[0] = 0;
  // return a null string
// Return true if user is running as root.
// i386: 224, ia64: 1105, amd64: 186, sparc 143
// Cpu architecture string
// Returns the kernel thread id of the currently running thread. Kernel
// thread id is used to access /proc.
// (Note that getpid() on LinuxThreads returns kernel thread id too; but
// on NPTL, it returns the same pid for all threads, as required by POSIX.)
  // old kernel, no NPTL support
// Most versions of linux have a bug where the number of processors is
// determined by looking at the /proc file system. In a chroot environment,
// the system call returns 1. This causes the VM to act as if it is
// a single processor and elide locking (see is_MP() call).
  "Java may be unstable running multithreaded in a chroot "
  "environment on Linux when /proc filesystem is not mounted.";
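// A minimal sketch of the kind of check the warning above refers to, assuming
// it is enough to test whether /proc has an entry for the current process.
// The helper names proc_entry_exists() and looks_like_unsafe_chroot() are
// hypothetical and are not taken from this file.

```cpp
#include <cstdio>
#include <unistd.h>

// Hypothetical helper: true if the path /proc/<id> exists.
bool proc_entry_exists(long id) {
  char fname[32];
  std::snprintf(fname, sizeof(fname), "/proc/%ld", id);
  return access(fname, F_OK) == 0;
}

// Hypothetical helper: a reported processor count of 1 combined with a
// missing /proc entry for this process suggests a chroot without /proc.
bool looks_like_unsafe_chroot() {
  long cpus = sysconf(_SC_NPROCESSORS_CONF);
  return cpus == 1 && !proc_entry_exists((long)getpid());
}
```

// On a machine that really has one processor and a mounted /proc, the second
// helper returns false, so the warning would not be printed.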
// sysinfo(SI_ARCHITECTURE, arch, sizeof(arch));
  // The next steps are taken in the product version:
  // Obtain the JAVA_HOME value from the location of libjvm[_g].so.
  // This library should be located at:
  // <JAVA_HOME>/jre/lib/<arch>/{client|server}/libjvm[_g].so.
  // If "/jre/lib/" appears at the right place in the path, then we
  // assume libjvm[_g].so is installed in a JDK and we use this path.
  // Otherwise exit with message: "Could not create the Java virtual machine."
  // The following extra steps are taken in the debugging version:
  // If "/jre/lib/" does NOT appear at the right place in the path
  // instead of exit check for $JAVA_HOME environment variable.
  // it looks like libjvm[_g].so is installed there
  // Important note: if the location of libjvm.so changes this
  // code needs to be changed accordingly.
  // The next few definitions allow the code to be verbatim:
 * The linker uses the following search paths to locate required
 * 7: The default directories, normally /lib and /usr/lib.
  /* sysclasspath, java_home, dll_dir */
  // Now cut the path to <java_home>/jre if we can.
  *pslash = '\0';
  /* get rid of /{client|server|hotspot} */
 * Where to look for native libraries
 * Note: Due to a legacy implementation, most of the library path
 * is set in the launcher. This was to accommodate linking restrictions
 * on legacy Linux implementations (which are no longer supported).
 * Eventually, all the library path setting will be done here.
 * However, to prevent the proliferation of improperly built native
 * Eventually, all the library path setting will be done here.
 * Construct the invariant part of ld_library_path. Note that the
 * space for the colon and the trailing null are provided by the
 * nulls included by the sizeof operator (so actually we allocate
 * a byte more than necessary).
 * Get the user setting of LD_LIBRARY_PATH, and prepend it. It
 * should always exist (until the legacy problem cited above is
  /* That's +1 for the colon and +1 for the trailing '\0' */
 * Extensions directories.
 * Note that the space for the colon and the trailing null are provided
 * by the nulls included by the sizeof operator (so actually one byte more
 * than necessary is allocated).
  /* Endorsed standards default directory. */
////////////////////////////////////////////////////////////////////////////////
// breakpoint support
  // use debugger to set breakpoint here
////////////////////////////////////////////////////////////////////////////////
  // Should also have an assertion stating we are still single-threaded.
  // Fill in signals that are necessarily unblocked for all threads in
  // the VM. Currently, we unblock the following signals:
  // SHUTDOWN{1,2,3}_SIGNAL: for shutdown hooks support (unless over-ridden
  // by -Xrs (=ReduceSignalUsage));
  // BREAK_SIGNAL which is unblocked only by the VM thread and blocked by all
  // other threads.
  // The "ReduceSignalUsage" boolean tells us not to alter
  // the dispositions or masks wrt these signals.
  // Programs embedding the VM that want to use the above signals for their
  // own purposes must, at this time, use the "-Xrs" option to prevent
  // interference with shutdown hooks and BREAK_SIGNAL thread dumping.
  // (See bug 4345157, and other related bugs).
  // In reality, though, unblocking these signals is really a nop, since
  // these signals are not blocked by default.
  // Fill in signals that are blocked by all but the VM thread.
// These are signals that are unblocked while a thread is running Java.
// (For some reason, they get blocked by default.)
// These are the signals that are blocked while a (non-VM) thread is
// running Java. Only the VM thread handles these signals.
// These are signals that are blocked during cond_wait to allow debugger in
  // Save caller's signal mask before setting VM signal mask
  // Only the VM thread handles BREAK_SIGNAL ...
  // ... all other threads block BREAK_SIGNAL
//////////////////////////////////////////////////////////////////////////////
// detecting pthread library
  // Save glibc and pthread version strings. Note that _CS_GNU_LIBC_VERSION
  // and _CS_GNU_LIBPTHREAD_VERSION are supported in glibc >= 2.3.2. Use a
  // generic name for earlier versions.
  // Define macros here so we can build HotSpot on old systems.
  // _CS_GNU_LIBC_VERSION is not supported, try gnu_get_libc_version()
  // Vanilla RH-9 (glibc 2.3.2) has a bug that confstr() always tells
  // us "NPTL-0.29" even if we are running with LinuxThreads. Check if this
  // is the case. LinuxThreads has a hard limit on max number of threads.
  // So sysconf(_SC_THREAD_THREADS_MAX) will return a positive value.
  // On the other hand, NPTL does not have such a limit, sysconf()
  // will return -1 and errno is not changed. Check if it is really NPTL.
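// A small sketch of querying the version strings mentioned above with
// confstr(). The helper names confstr_or(), glibc_version() and
// libpthread_version(), and the fallback strings, are assumptions made for
// illustration; the surrounding file may structure this differently.

```cpp
#include <cstddef>
#include <string>
#include <unistd.h>

// Hypothetical helper: returns the confstr() value for 'name', or 'fallback'
// if the C library does not report one.
std::string confstr_or(int name, const char* fallback) {
  size_t n = confstr(name, NULL, 0);  // required buffer size, 0 on error
  if (n == 0) return fallback;
  std::string buf(n, '\0');
  confstr(name, &buf[0], n);
  buf.resize(n - 1);                  // drop the trailing '\0'
  return buf;
}

std::string glibc_version() {
#ifdef _CS_GNU_LIBC_VERSION
  return confstr_or(_CS_GNU_LIBC_VERSION, "glibc (unknown)");
#else
  return "glibc (unknown)";           // generic name for older systems
#endif
}

std::string libpthread_version() {
#ifdef _CS_GNU_LIBPTHREAD_VERSION
  return confstr_or(_CS_GNU_LIBPTHREAD_VERSION, "pthread (unknown)");
#else
  return "pthread (unknown)";
#endif
}
```

// The #ifdef guards keep the sketch buildable where the _CS_GNU_* names are
// missing, which is the situation the "generic name" comment describes.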
  // glibc before 2.3.2 only has LinuxThreads.
  // LinuxThreads have two flavors: floating-stack mode, which allows variable
  // stack size; and fixed-stack mode. NPTL is always floating-stack.
/////////////////////////////////////////////////////////////////////////////
// Force Linux kernel to expand current thread stack. If "bottom" is close
// to the stack guard, caller should block all signals.
// A special mmap() flag that is used to implement thread stacks. It tells
// kernel that the memory region should extend downwards when needed. This
// allows early versions of LinuxThreads to only mmap the first few pages
// when creating a new thread. Linux kernel will automatically expand thread
// stack as needed (on page faults).
// However, because the memory region of a MAP_GROWSDOWN stack can grow on
// demand, if a page fault happens outside an already mapped MAP_GROWSDOWN
// region, it's hard to tell if the fault is due to a legitimate stack
// overrun). As a rule, if the fault happens below current stack pointer,
// Linux kernel does not expand stack, instead a SIGSEGV is sent to the
// This Linux feature can cause SIGSEGV when VM bangs thread stack for
// stack overflow detection.
// Newer version of LinuxThreads (since glibc-2.2, or, RH-7.x) and NPTL do
// not use this flag. However, the stack of initial thread is not created
// by pthread, it is still MAP_GROWSDOWN. Also it's possible (though
// unlikely) that user code can create a thread with MAP_GROWSDOWN stack
// and then attach the thread to JVM.
// To get around the problem and allow stack banging on Linux, we need to
// manually expand thread stack after receiving the SIGSEGV.
// There are two ways to expand thread stack to address "bottom", we used
// both of them in JVM before 1.5:
// 1. adjust stack pointer first so that it is below "bottom", and then
// 2. mmap() the page in question
// Now alternate signal stack is gone, it's harder to use 2. For instance,
// if current sp is already near the lower end of page 101, and we need to
// call mmap() to map page 100, it is possible that part of the mmap() frame
// will be placed in page 100. When page 100 is mapped, it is zero-filled.
// That will destroy the mmap() frame and cause VM to crash.
// The following code works by adjusting sp first, then accessing the "bottom"
// page to force a page fault. Linux kernel will then automatically expand the
// _expand_stack_to() assumes its frame size is less than page size, which
// should always be true if the function is not inlined.
#if __GNUC__ < 3 // gcc 2.x does not support noinline attribute
  // Adjust bottom to point to the largest address within the same page, it
  // gives us a one-page buffer if alloca() allocates slightly more memory.
  // sp might be slightly above current stack pointer; if that's the case, we
  // will alloca() a little more space than necessary, which is OK. Don't use
  // os::current_stack_pointer(), as its result can be slightly below current
  // stack pointer, causing us to not alloca enough to reach "bottom".
//////////////////////////////////////////////////////////////////////////////
// check if it's safe to start a new thread
  // Fixed stack LinuxThreads (SuSE Linux/x86, and some versions of Redhat)
  // Heap is mmap'ed at lower end of memory space. Thread stacks are
  // allocated (MAP_FIXED) from high address space. Every thread stack
  // occupies a fixed size slot (usually 2Mbytes, but user can change
  // it to other values if they rebuild LinuxThreads).
  // Problem with MAP_FIXED is that mmap() can still succeed even if part of
  // the memory region has already been mmap'ed. That means if we have too
  // many threads and/or very large heap, eventually thread stack will
  // collide with heap.
  // Here we try to prevent heap/stack collision by comparing current
  // stack bottom with the highest address that has been mmap'ed by JVM
  // plus a safety margin for memory maps created by native code.
  // This feature can be disabled by setting ThreadSafetyMargin to 0
  // not safe if our stack extends below the safety margin
  // Floating stack LinuxThreads or NPTL:
  // Unlike fixed stack LinuxThreads, thread stacks are not MAP_FIXED. When
  // there's not enough space left, pthread_create() will fail. If we come
  // here, that means enough space has been reserved for stack.
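// The heap/stack collision check described above can be sketched as a pure
// comparison. This is an illustration only: is_safe_to_start() and its three
// parameters are hypothetical stand-ins for the current stack bottom, the
// highest address mmap'ed by the VM, and the ThreadSafetyMargin setting.

```cpp
#include <cstddef>
#include <cstdint>

// stack_bottom      lowest address of the new thread's stack (assumed input)
// highest_reserved  highest address mmap'ed by the VM so far (assumed input)
// safety_margin     room left for native mappings; 0 disables the check
bool is_safe_to_start(uintptr_t stack_bottom,
                      uintptr_t highest_reserved,
                      size_t safety_margin) {
  if (safety_margin == 0) return true;            // feature disabled
  if (stack_bottom < safety_margin) return false; // avoid unsigned wrap-around
  // not safe if the stack extends below the safety margin
  return stack_bottom - safety_margin >= highest_reserved;
}
```

// For example, a stack bottom of 0x8000 with a margin of 0x1000 is accepted
// when the highest reserved address is 0x4000, while a stack bottom of 0x4800
// is rejected.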
// Thread start routine for all newly created threads
  // Try to randomize the cache line index of hot stack frames.
  // This helps when threads of the same stack traces evict each other's
  // cache lines. The threads can be either from the same JVM instance, or
  // from different JVM instances. The benefit is especially true for
  // processors with hyperthreading technology.
  // non floating stack LinuxThreads needs extra check, see above
  // notify parent thread
  // thread_id is kernel thread id (similar to Solaris LWP id)
  // initialize signal mask for this thread
  // initialize floating point control register
  // handshaking with parent thread
  // notify parent thread
  // wait until os::start_thread()
  // call one more level start routine
  // Allocate the OSThread object
  // set the correct thread state
  // Initial state is ALLOCATED but not INITIALIZED
  // init thread attributes
  // calculate stack size if it's not specified by caller
  // Java threads use ThreadStackSize, whose default value can be changed with the flag -Xss
  }
  // else fall through:
  // use VMThreadStackSize if CompilerThreadStackSize is not defined
  // let pthread_create() pick the default value.
  // Serialize thread creation if we are running with fixed stack LinuxThreads
  // Need to clean up stuff we've allocated so far
  // Store pthread info into the OSThread
  // Wait until child thread is either initialized or aborted
  // Aborted due to thread limit being reached
  // The thread is returned suspended (in state INITIALIZED),
  // and is started higher up in the call chain
/////////////////////////////////////////////////////////////////////////////
// attach existing thread
// bootstrap the main thread
  // Allocate the OSThread object
  // Store pthread info into the OSThread
  // initialize floating point control register
  // Initial thread state is RUNNABLE
  // If current thread is initial thread, its stack is mapped on demand,
  // see notes about MAP_GROWSDOWN. Here we try to force kernel to map
  // the entire stack region to avoid SEGV in stack banging.
  // It is also useful to get around the heap-stack-gap problem on SuSE
  // kernel (see 4821821 for details). We first expand stack to the top
  // of yellow zone, then enable stack yellow zone (order is significant,
  // enabling yellow zone first will crash JVM on SuSE Linux), so there
  // is no gap between the last two virtual memory regions.
  // initialize signal mask for this thread
  // and save the caller's signal mask
// Free Linux resources related to the OSThread
  // Restore caller's signal mask
//////////////////////////////////////////////////////////////////////////////
// thread local storage
// Note: This is currently not used by VM, as we don't destroy TLS key
//////////////////////////////////////////////////////////////////////////////
// Check if current thread is the initial thread, similar to Solaris thr_main.
  // If called before init complete, thread stack bottom will be null.
  // Can be called if fatal error occurs before initialization.
  "os::init did not locate initial thread's stack region");
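// One plausible way to express the initial-thread test described above is to
// ask whether an address on the current stack falls inside the recorded
// initial stack region. StackRegion and is_within_stack() are hypothetical
// names used only for this sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record of the initial thread's stack, as located at startup.
struct StackRegion {
  uintptr_t bottom;  // lowest address; 0 means "not located yet"
  size_t    size;    // size in bytes
};

// True if 'p' lies in [bottom, bottom + size).
bool is_within_stack(const StackRegion& r, const void* p) {
  if (r.bottom == 0) return false;  // called before init is complete
  uintptr_t a = reinterpret_cast<uintptr_t>(p);
  return a >= r.bottom && a < r.bottom + r.size;
}
```

// A caller would typically pass the address of a local variable, which is a
// cheap way to obtain an address on the current thread's stack.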
// Find the virtual memory area that contains addr
// Locate initial thread stack. This special handling of initial thread stack
// is needed because pthread_getattr_np() on most (all?) Linux distros returns
// bogus value for initial thread.
  // stack size is the easy part, get it from RLIMIT_STACK
  // 6308388: a bug in ld.so will relocate its own .data section to the
  // lower end of primordial stack; reduce ulimit -s value a little bit
  // so we won't install guard page on ld.so's data section.
  // 4441425: avoid crash with "unlimited" stack size on SuSE 7.1 or Redhat
  // 7.1, in both cases we will get 2G in return value.
  // 4466587: glibc 2.2.x compiled w/o "--enable-kernel=2.4.0" (RH 7.0,
  // SuSE 7.2, Debian) can not handle alternate signal stack correctly
  // for initial thread if its stack size exceeds 6M. Cap it at 2M,
  // in case other parts in glibc still assumes 2M max stack size.
  // FIXME: alt signal stack is gone, maybe we can relax this constraint?
  // Problem still exists RH7.2 (IA64 anyway) but 2MB is a little small
  // Try to figure out where the stack base (top) is. This is harder.
  // When an application is started, glibc saves the initial stack pointer in
  // a global variable "__libc_stack_end", which is then used by system
  // libraries. __libc_stack_end should be pretty close to stack top. The
  // variable is available since the very early days. However, because it is
  // a private interface, it could disappear in the future.
  // Linux kernel saves start_stack information in /proc/<pid>/stat. Similar
  // to __libc_stack_end, it is very close to stack top, but isn't the real
  // stack top. Note that /proc may not exist if VM is running as a chroot
  // program, so reading /proc/<pid>/stat could fail. Also the contents of
  // /proc/<pid>/stat could change in the future (though unlikely).
  // We try __libc_stack_end first. If that doesn't work, look for
  // /proc/<pid>/stat. If neither of them works, we use current stack pointer
  // as a hint, which should work well in most cases.
  // try __libc_stack_end first
  // Figure what the primordial thread stack base is. Code is inspired
  // followed by command name surrounded by parentheses, state, etc.
  // Skip pid and the command string. Note that we could be dealing with
  // weird command names, e.g. user could decide to rename java launcher
  // to "java 1.4.2 :)", then the stat file would look like
  //   1234 (java 1.4.2 :)) R ... ...
  // We don't really need to know the command string, just find the last
  // occurrence of ")" and then start parsing from there. See bug 4726580.
  /*                                     1   1   1   1   1   1   1   1   1   1   2   2   2   2   2   2   2   2   2 */
  /*              3  4  5  6  7  8   9   0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5   6   7   8 */
  i = sscanf(s, "%c %d %d %d %d %d %lu %lu %lu %lu %lu %lu %lu %ld %ld %ld %ld %ld %ld "
  // product mode - assume we are the initial thread, good luck in the
  warning("Can't detect initial thread stack location - bad conversion");
  // FreeBSD with a Linux emulator, or inside chroot), this should work for
  // most cases, so don't abort:
  // Now we have a pointer (stack_start) very close to the stack top, the
  // next thing to do is to figure out the exact location of stack top. We
  // can find out the virtual memory area that contains stack_start by
  // and its upper limit is the real stack top. (again, this would fail if
  // running inside chroot, because /proc may not exist.)
  // success, "high" is the true stack top. (ignore "low", because initial
  // thread stack grows on demand, its real bottom is high - RLIMIT_STACK.)
  warning("Can't detect initial thread stack location - find_vma failed");
  // best effort: stack_start is normally within a few pages below the real
  // stack top, use it as stack top, and reduce stack size so we won't put
  // guard page outside stack.
  // stack_top could be partially down the page so align it
////////////////////////////////////////////////////////////////////////////////
// Time since start-up in seconds to a fine granularity.
// Used by VMSelfDestructTimer and the MemProfiler.
  return (1000 * 1000);
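// A sketch of how a counter with a frequency of 1000 * 1000 ticks per second
// can be turned into seconds, assuming the counter is derived from
// gettimeofday(). The function names below are hypothetical and only
// illustrate the counter/frequency relationship.

```cpp
#include <cstddef>
#include <sys/time.h>

// Ticks per second of the counter below (microsecond resolution).
long long elapsed_frequency() { return 1000 * 1000; }

// Hypothetical counter: wall-clock microseconds taken from gettimeofday().
long long elapsed_counter_now() {
  timeval tv;
  gettimeofday(&tv, NULL);
  return (long long)tv.tv_sec * 1000 * 1000 + tv.tv_usec;
}

// Seconds elapsed between two counter readings.
double elapsed_seconds(long long start, long long now) {
  return (double)(now - start) / (double)elapsed_frequency();
}
```

// Because gettimeofday() is a real-time clock, the difference between two
// readings can jump if the system time is adjusted.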
// For now, we say that linux does not support vtime. I have no idea
// whether it can actually be made to (DLD, 9/13/05).
  // better than nothing, but not much
  // we do dlopen's in this particular order due to bug in linux
  // dynamic loader (see 6348968) leading to crash on exit
  // See if monotonic clock is supported by the kernel. Note that some
  // early implementations simply return kernel jiffies (updated every
  // 1/100 or 1/1000 second). It would be bad to use such a low res clock
  // for nano time (though the monotonic property is still nice to have).
  // It's fixed in newer kernels, however clock_getres() still returns
  // 1/HZ. We check if clock_getres() works, but will ignore its reported
  // resolution for now. Hopefully as people move to new kernels, this
  // won't be a problem.
  // yes, monotonic clock is supported
  // close librt if there is no monotonic clock
  // Switch to using fast clocks for thread cpu time if
  // the sys_clock_getres() returns 0 error code.
  // Note, that some kernels may support the current thread
  // clock (CLOCK_THREAD_CPUTIME_ID) but not the clocks
  // returned by the pthread_getcpuclockid().
  // If the fast Posix clocks are supported then the sys_clock_getres()
  // must return at least tp.tv_sec == 0 which means a resolution
  // better than 1 sec. This is extra check for reliability.
  // CLOCK_MONOTONIC - amount of time since some arbitrary point in the past
  // gettimeofday - based on time in seconds since the Epoch thus does not wrap
  // gettimeofday is a real time clock so it skips
// Return the real, user, and system times in seconds from an
// arbitrary fixed point in the past.
////////////////////////////////////////////////////////////////////////////////
// runtime exit support
// Note: os::shutdown() might be called very early during initialization, or
// called from signal handler. Before adding something to os::shutdown(), make
// sure it is async-safe and can handle partially initialized VM.
  // allow PerfMemory to attempt cleanup of any persistent resources
  // needs to remove object in file system
  // flush buffered output, finish log files
  // Check for abort hook
// Note: os::abort() might be called very early during initialization, or
// called from signal handler. Before adding something to os::abort(), make
// sure it is async-safe and can handle partially initialized VM.
// Die immediately, no exit hook, no abort hook, no cleanup.
  // _exit() on LinuxThreads only kills current thread
// unused on linux for now.
  // Under the old linux thread library, linux gives each thread
  // its own process id. Because of this each thread will return
  // a different pid if this method were to return the result
  // of getpid(2). Linux provides no api that returns the pid
  // of the launcher thread for the vm. This implementation
  // returns a unique pid, the pid of the launcher thread
  // that starts the vm 'process'.
  // Under the NPTL, getpid() returns the same pid as the
  // launcher thread rather than a unique pid per thread.
  // Use gettid() if you want the old pre NPTL behaviour.
  // if you are looking for the result of a call to getpid() that
  // returns a unique pid for the calling thread, then look at the
  // copied from libhpi
  /* Quietly truncate on buffer overflow. Should be an error. */
// check if addr is inside libjvm[_g].so
  // iterate through all loadable segments
  // base address of a library is the lowest address of its loaded
  // see if 'addr' is within current segment
  // dlpi_name is NULL or empty if the ELF file is executable, return 0
  // so dll_address_to_library_name() can fall through to use dladdr() which
  // can figure out executable name from argv[0].
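// The segment walk sketched in the comments above can be written with
// dl_iterate_phdr() from <link.h>. This is an illustrative version, not the
// code from this file: AddrQuery, find_segment() and find_object_for() are
// hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <link.h>

// Hypothetical query/result record passed through the callback.
struct AddrQuery {
  uintptr_t   addr;   // address being looked up
  uintptr_t   base;   // load base of the object that contains it
  const char* name;   // dlpi_name; NULL or empty for the executable itself
  bool        found;
};

// Callback invoked once per loaded object.
static int find_segment(struct dl_phdr_info* info, size_t /*size*/, void* data) {
  AddrQuery* q = static_cast<AddrQuery*>(data);
  // iterate through all loadable segments
  for (int i = 0; i < info->dlpi_phnum; i++) {
    const ElfW(Phdr)& ph = info->dlpi_phdr[i];
    if (ph.p_type != PT_LOAD) continue;
    uintptr_t start = info->dlpi_addr + ph.p_vaddr;
    // see if 'addr' is within the current segment
    if (q->addr >= start && q->addr < start + ph.p_memsz) {
      q->base  = info->dlpi_addr;
      q->name  = info->dlpi_name;
      q->found = true;
      return 1;  // a non-zero return stops the iteration
    }
  }
  return 0;      // keep iterating
}

// Fills 'out' and returns true if some loaded object contains 'p'.
bool find_object_for(const void* p, AddrQuery* out) {
  out->addr  = reinterpret_cast<uintptr_t>(p);
  out->base  = 0;
  out->name  = NULL;
  out->found = false;
  dl_iterate_phdr(find_segment, out);
  return out->found;
}
```

// Looking up the address of a static variable in the main program should
// report a hit whose name is NULL or empty, matching the note about
// executables above.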
  // There is a bug in old glibc dladdr() implementation that it could resolve
  // to wrong library name if the .so file has a base address != NULL. Here
  // we iterate through the program headers of all loaded libraries to find
  // out which library 'addr' really belongs to. This workaround can be
  // removed once the minimum requirement for glibc is moved to 2.3.x.
  // buf already contains library name
  // Loads .dll/.so and
  // in case of error it checks if .dll/.so was built for the
  // same architecture as Hotspot is running on
  // Successful loading
  // Read system error message into ebuf
  // It may or may not be overwritten below
  // No more space in ebuf for additional diagnostics message
  // Can't open library, report dlerror() message
  // file i/o error - report dlerror() msg
  char* name; // String representation
  // Identify compatibility class for VM's architecture and library's architecture
  // Obtain string descriptions for architectures
  "Didn't find running architecture code (running_arch_code) in arch_array");
  // Even though running architecture detection failed
  // we may still continue with reporting dlerror() message
  " (Possible cause: can't load %s-bit .so on a %s-bit platform)",
  " (Possible cause: can't load this .so (machine code=0x%x) on a %s-bit platform)",
 * glibc-2.0 libdl is not MT safe. If you are building with any glibc,
 * chances are you might want to run the generated bits against glibc-2.0
 * libdl.so, so always use locking for any version of glibc.
  st->print("Can not get library information for pid = %d\n", pid);
  // Try to identify popular distros.
  // so the order is important.
  // Print warning if unsafe chroot environment detected
  // values in struct sysinfo are "unsigned long"
// but they're the same for all the linux arch that we support
// and they're the same for solaris but there's no common place to put this.
const char *ill_names[] = { "ILL0", "ILL_ILLOPC", "ILL_ILLOPN", "ILL_ILLADR",
                            "ILL_ILLTRP", "ILL_PRVOPC", "ILL_PRVREG",
                            "ILL_COPROC", "ILL_BADSTK" };
const char *fpe_names[] = { "FPE0", "FPE_INTDIV", "FPE_INTOVF", "FPE_FLTDIV",
                            "FPE_FLTOVF", "FPE_FLTUND", "FPE_FLTRES",
                            "FPE_FLTINV", "FPE_FLTSUB", "FPE_FLTDEN" };
const char *segv_names[] = { "SEGV0", "SEGV_MAPERR", "SEGV_ACCERR" };
const char *bus_names[] = { "BUS0", "BUS_ADRALN", "BUS_ADRERR", "BUS_OBJERR" };
  st->print("\n\nError accessing class data sharing archive." \
            " Mapped file inaccessible during execution, " \
  assert(false, "must use a large-enough buffer");
  // Lazy resolve the path to current module.
  // Support for the gamma launcher. Typical value for buf is
  // the right place in the string, then assume we are installed in a JDK and
  // we're done. Otherwise, check for a JAVA_HOME environment variable and fix
  // up the path so it looks like libjvm.so is installed there (append a
  for (--p; p > buf && *p != '/'; --p)
  // Look for JAVA_HOME in the environment.
  // Use current module name "libjvm[_g].so" instead of
  // "libjvm"debug_only("_g")".so" since for fastdebug version
  // we should have "libjvm.so" but debug_only("_g") adds "_g"!
  // It is used when we are choosing the HPI library's name
  // "libhpi[_g].so" in hpi::initialize_get_interface().
  // Go back to path of .so
  // no prefix required, not even "_"
  // no suffix required
////////////////////////////////////////////////////////////////////////////////
// sun.misc.Signal support
  // 4511530 - sem_post is serialized and handled by the manager thread. When
  // the program is interrupted by Ctrl-C, SIGINT is sent to every thread. We
  // don't want to flood the manager thread with sem_post requests.
  // Ctrl-C is pressed during error reporting, likely because the error
  // handler fails to abort. Let VM die immediately.
  // -1 means registration failed
 * The following code is moved from os.cpp for making this
 * code platform specific, which it is by its very nature.
// Will be modified when max signal is changed to be dynamic
// a counter for each possible signal value
// Linux(POSIX) specific hand shaking semaphore.
  // Initialize signal structures
  // Initialize signal semaphore
  for (int i = 0; i < NSIG + 1; i++) {
  // cleared by handle_special_suspend_equivalent_condition() or java_suspend_self()
  // were we externally suspended while we were waiting?
  // The semaphore has been incremented, but while we were waiting
  // another thread suspended us. We don't want to continue running
  // while suspended because that would surprise the thread that
////////////////////////////////////////////////////////////////////////////////
  // Seems redundant as all get out
// Solaris allocates memory by pages.
// Rationale behind this function:
// current (Mon Apr 25 20:12:18 MSD 2005) oprofile drops samples without executable
// mapping for address (see lookup_dcookie() in the kernel module), thus we cannot get
// samples for JITted code. Here we create private executable mapping over the code cache
// and then we can use standard (well, almost, as mapping can change) way to provide
// info for the reporting script by storing timestamp and location of symbol
// NOTE: Linux kernel does not really reserve the pages for us.
// All it does is to check if there are enough free pages
// left at the time of mmap(). This could be a potential
  // sched_getcpu() should be in libc.
  // Create a cpu -> node mapping
// rebuild_cpu_to_node_map() constructs a table mapping cpu id to node id.
// The table is later used in get_node_by_cpu().
  const size_t NCPUS = 32768;
  // Since the buffer size computation is very obscure
  // in libnuma (possible values are starting from 16,
  // and continuing up with every other power of 2, but less
  // than the maximum number of CPUs supported by kernel), and
  // is a subject to change (in libnuma version 2 the requirements
  // are more reasonable) we'll just hardcode the number they use
// If 'fixed' is true, anon_mmap() will attempt to reserve anonymous memory
// at 'requested_addr'. If there are existing memory mappings at the same
// location, however, they will be overwritten. If 'fixed' is false,
// 'requested_addr' is only treated as a hint, the return value may or
// may not start from the requested address. Unlike Linux mmap(), this
// function returns NULL to indicate failure.
  // anon_mmap() should only get called during VM initialization,
  // don't need lock (actually we can skip locking even if it can be called
  // from multiple threads, because _highest_vm_reserved_address is just a
  // hint about the upper limit of non-stack memory regions.)
// Don't update _highest_vm_reserved_address, because there might be memory
// regions above addr + size. If so, releasing a memory region only creates
// a hole in the address space, it doesn't help prevent heap-stack collision.
  // Linux wants the mprotect address argument to be page aligned.
  // According to SUSv3, mprotect() should only be used with mappings
  // established by mmap(), and mmap() always maps whole pages. Unaligned
  // 'addr' likely indicates problem in the VM (e.g. trying to change
  // protection of malloc'ed or statically allocated memory). Check the
  // caller if you hit this assert.
// Set protections specified
// is_committed is unused.
// large_page_size on Linux is used to round up heap size. x86 uses either
// 2M or 4M page, depending on whether PAE (Physical Address Extensions)
// mode is enabled. AMD64/EM64T uses 2M page in 64bit mode. IA64 can use
// page as large as 256M.
// Here we try to figure out page size by parsing /proc/meminfo and looking
// for a line with the following format:
// If we can't determine the value (e.g. /proc is not mounted, or the text
// format has been changed), we'll use the largest page size supported by
  if (fscanf(fp, "Hugepagesize: %d", &x) == 1) {
if (
ch ==
EOF ||
ch == (
int)
'\n')
break;
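
// Illustrative sketch, not the code from this file: one way the
// /proc/meminfo scan described above could look. parse_hugepage_size()
// and its 'fallback' parameter are hypothetical names; the " kB" unit
// check is an assumption about the line format.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: scans a meminfo-style stream for a line of the form
// "Hugepagesize: <n> kB" and returns the size in bytes. Returns 'fallback'
// when no usable line is found.
static long parse_hugepage_size(FILE* fp, long fallback) {
  assert(fp != NULL);
  int x = 0;
  while (!feof(fp)) {
    if (fscanf(fp, "Hugepagesize: %d", &x) == 1) {
      char unit[8];
      // Only trust the value when it is positive and followed by "kB".
      if (x > 0 && fscanf(fp, "%7s", unit) == 1 && strcmp(unit, "kB") == 0) {
        return (long)x * 1024;
      }
      return fallback;
    }
    // This line did not match: skip to the end of the line.
    for (;;) {
      int ch = fgetc(fp);
      if (ch == EOF || ch == (int)'\n') break;
    }
  }
  return fallback;
}
```

// A caller would pass an already opened /proc/meminfo stream and a default
// page size to fall back on when the line is absent.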
// Large page support is available on 2.6 or newer kernel, some vendors
// (e.g. Redhat) have backported it to their 2.4 based distributions.
// We optimistically assume the support is available. If later it turns out
// not true, VM will automatically switch to use regular page size.

// Create a large shared memory region to attach to based on size.
// Currently, size is the total size of the heap

// Possible reasons for shmget failure:
// 1. shmmax is too small for Java heap.
// 2. not enough large page memory.
//    > increase amount of large pages:
//    Note 1: different Linux may use different name for this property,
//            e.g. on Redhat AS-3 it is "hugetlb_pool".
//    Note 2: it's possible there's enough physical memory available but
//            they are so fragmented after a long run that they can't
//            coalesce into large pages. Try to reserve large pages when
//            the system is still "fresh".

// Remove shmid. If shmat() is successful, the actual shared memory segment
// will be deleted when it's detached by shmdt() or when the process
// terminates. If shmat() is not successful this will remove the shared

// detaching the SHM segment will also delete it, see reserve_memory_special()

// Linux does not support anonymous mmap with large page memory. The only way
// to reserve large page memory without file backing is through SysV shared
// memory API. The entire memory region is committed and pinned upfront.
// Hopefully this will change in the future...

// Reserve memory at an arbitrary address, only if that area is
// available (and not reserved for something else).

// Assert only that the size is a multiple of the page size, since
// that's all that mmap requires, and since that's all we really know
// about at this low abstraction level. If we need higher alignment,
// we can either pass an alignment to this method or verify alignment
// in one of the methods further up the call chain. See bug 5044738.
// Repeatedly allocate blocks until the block is allocated at the
// right spot. Give up after max_tries. Note that reserve_memory() will
// automatically update _highest_vm_reserved_address if the call is
// successful. The variable tracks the highest memory address ever reserved
// by JVM. It is used to detect heap-stack collision if running with
// fixed-stack LinuxThreads. Because here we may attempt to reserve more
// space than needed, it could confuse the collision detecting code. To
// solve the problem, save current _highest_vm_reserved_address and
// calculate the correct value before return.

// Linux mmap allows caller to pass an address as hint; give it a try first,
// if kernel honors the hint then we can return immediately.

// mmap() is successful but it fails to reserve at the requested address

// Is this the block we wanted?

// Does this overlap the block we wanted? Give back the overlapped

// Give back the unused reserved pieces.
for (int j = 0; j < i; ++j) {
// TODO-FIXME: reconcile Solaris' os::sleep with the linux variation.
// Solaris uses poll(), linux uses park().
// Poll() is likely a better choice, assuming that Thread.interrupt()
// generates a SIGUSRx signal. Note that SIGUSR1 can interfere with

// time moving backwards, should only happen if no monotonic clock
// not a guarantee() because JVM should not abort on kernel/glibc bugs

// cleared by handle_special_suspend_equivalent_condition() or
// java_suspend_self() via check_and_wait_while_suspended()

// were we externally suspended while we were waiting?

// It'd be nice to avoid the back-to-back javaTimeNanos() calls on

// time moving backwards, should only happen if no monotonic clock
// not a guarantee() because JVM should not abort on kernel/glibc bugs

// %% make the sleep time an integer flag. for now use 1 millisec.

// Sleep forever; naked call to OS-specific sleep; use with CAUTION
while (true) {    // sleep forever ...
  ::sleep(100);   // ... 100 seconds at a time

// Used to convert frequent JVM_Yield() to nops

// Yields to all threads, including threads with lower priorities
// Threads on Linux are all with same priority. The Solaris style
// os::yield_all() with nanosleep(1ms) is not necessary.

// Called from the tight loops to possibly influence time-sharing heuristics

////////////////////////////////////////////////////////////////////////////////
// thread priority support

// Note: Normal Linux applications are run with SCHED_OTHER policy. SCHED_OTHER
// only supports dynamic priority, static priority must be zero. For real-time
// applications, Linux supports SCHED_RR which allows static priority (1-99).
// However, for large multi-threaded applications, SCHED_RR is not only slower
// than SCHED_OTHER, but also very unstable (my volano tests hang hard 4 out
// of 5 runs - Sep 2005).

// not the entire user process, and user level threads are 1:1 mapped to kernel
// threads. It has always been the case, but could change in the future. For
// this reason, the code should not be used as default (ThreadPriorityPolicy=0).
// It is only used when ThreadPriorityPolicy=1 and requires root privilege.

19,   // 0 Entry should never be used

// Only root can raise thread priority. Don't allow ThreadPriorityPolicy=1
// if effective uid is not root. Perhaps, a more elegant way of doing
// this is to test CAP_SYS_NICE capability, but that will require libcap.so
warning("-XX:ThreadPriorityPolicy requires root privilege on Linux");
// Hint to the underlying OS that a task switch would not be good.
// Void return because it's a hint and can fail.

////////////////////////////////////////////////////////////////////////////////
// the low-level signal-based suspend/resume support is a remnant from the
// old VM-suspension that used to be for java-suspension, safepoints etc,
// within hotspot. Now there is a single use-case for this:
//    - calling get_thread_pc() on the VMThread by the flat-profiler task
//      that runs in the watcher thread.
// The remaining code is greatly simplified from the more general suspension
// code that used to be used.
//
// The protocol is quite simple:
//    - sends a signal to the target thread
//    - polls the suspend state of the osthread using a yield loop
//    - target thread signal handler (SR_handler) sets suspend state
//      and blocks in sigsuspend until continued
//    - sets target osthread state to continue
//    - sends signal to end the sigsuspend loop in the SR_handler
//
// Note that the SR_lock plays no role in this suspend/resume protocol.

// notify the suspend action is completed, we have now resumed

// Handler function invoked when a thread's execution is suspended or
// resumed. We have to be careful that only async-safe functions are
// called here (Note: most pthread functions are not async safe and
//
// Note: sigwait() is a more natural fit than sigsuspend() from an
// interface point of view, but sigwait() prevents the signal handler
// from being run. libpthread would get very confused by not having
// its signal handlers run and prevents sigwait()'s use with the
// mutex granting signal.
//
// Currently only ever called on the VMThread

// Save and restore errno to avoid confusing native code with EINTR

// read current suspend action

// Notify the suspend action is about to be completed. do_suspend()
// waits until SR_SUSPENDED is set and then returns. We will wait
// here for a resume signal and that completes the suspend-other

// the same thread - so there are no races

// get current set of blocked signals and unblock resume signal

// wait here until we are resumed

// ignore all returns until we get a resume signal

// nothing special to do - just leave the handler

if ((s = ::getenv("_JAVA_SR_SIGNUM")) != 0) {

"SR_signum must be greater than max(SIGSEGV, SIGBUS), see 4355769");
// SR_signum is blocked by default.
// 4528190 - We also need to block pthread restart signal (32 on all
// supported Linux platforms). Note that LinuxThreads need to block
// this signal for all threads to work properly. So we don't have
// to use hard-coded signal number when setting up the mask.

// returns true on success and false on error - really an error is fatal
// but this seems the normal response to library errors

// mark as suspended and send signal

// check status and wait until notified of suspension

// check status and wait until notified of resumption

////////////////////////////////////////////////////////////////////////////////

"possibility of dangling Thread pointer");

// More than one thread can get here with the same value of osthread,
// resulting in multiple notifications. We do, however, want the store
// to interrupted() to be visible to other threads before we execute unpark().

// For JSR166. Unpark even if interrupt status already was set

"possibility of dangling Thread pointer");
// consider thread->_SleepEvent->reset() ... optional optimization

///////////////////////////////////////////////////////////////////////////////////
// This routine may be used by user applications as a "hook" to catch signals.
// The user-defined signal handler must pass unrecognized signals to this
// routine, and if it returns true (non-zero), then the signal handler must
// return immediately. If the flag "abort_if_unrecognized" is true, then this
// routine will never return false (zero), but instead will execute a VM panic
// routine that kills the process.
//
// If this routine returns false, it is OK to call it again. This allows
// the user-defined signal handler to perform checks either before or after
// the VM performs its own checks. Naturally, the user code would be making
// a serious error if it tried to handle an exception (such as a null check
// or breakpoint) that the VM was generating for its own correct operation.
//
// This routine may recognize any of the following kinds of signals:
//    SIGBUS, SIGSEGV, SIGILL, SIGFPE, SIGQUIT, SIGPIPE, SIGXFSZ, SIGUSR1.
// It should be consulted by handlers for any of those signals.
//
// The caller of this routine must pass in the three arguments supplied
// to the function referred to in the "sa_sigaction" (not the "sa_handler")
// field of the structure passed to sigaction(). This routine assumes that
// the sa_flags field passed to sigaction() includes SA_SIGINFO and SA_RESTART.
//
// Note that the VM will print warnings if it detects conflicting signal
// handlers, unless invoked with the option "-XX:+AllowUserSignalHandlers".

// This boolean allows users to forward their own non-matching signals
// to JVM_handle_linux_signal, harmlessly.

// Retrieve the old signal handler from libjsig

// Retrieve the preinstalled signal handler from jvm

// Call the old signal handler

// It's more reasonable to let jvm treat it as an unexpected exception
// instead of taking the default action.

// automatically block the signal

// retrieve the chained handler

// try to honor the signal mask

// call into the chained handler

// restore the signal mask

// Tell jvm's signal handler the signal is taken care of.

if ((((unsigned int)1 << sig) & sigs) != 0) {
// Do not overwrite; user takes responsibility to forward to us.

// save the old handler in jvm

// libjsig also interposes the sigaction() call below and saves the
// old sigaction on its own.

fatal2("Encountered unexpected pre-existing sigaction handler %#lx for signal %d.", (long)oldhand, sig);
// Save flags, which are set by ours

// install signal handlers for signals that HotSpot needs to
// handle in order to support Java-level exception handling.

// Tell libjsig jvm is setting signal handlers

// Tell libjsig jvm finishes setting signal handlers

// We don't activate signal checker if libjsig is in place, we trust ourselves
// and if UserSignalHandler is installed all bets are off
tty->print_cr("Info: libjsig is activated, all active signal checking is disabled");
tty->print_cr("Info: AllowUserSignalHandlers is activated, all active signal checking is disabled");
// This is the fastest way to get thread cpu time on Linux.
// Returns cpu time (user+sys) for any thread, not only for current.
// POSIX compliant clocks are implemented in the kernels 2.6.16+.
// It might work on 2.6.10+ with a special kernel/glibc patch.
// For reference, please, see IEEE Std 1003.1-2004:

assert(rc == 0, "clock_gettime is expected to return 0 code");
// glibc on Linux platform uses non-documented flag
// to indicate, that some special sort of signal
// We will never set this flag, and we should
// ignore this flag in our diagnostic

// See comment for SIGNIFICANT_SIGNAL_MASK define

// Maybe the handler was reset by VMError?

// Check: is it our handler?

// It is our signal handler

// check for flags, reset system-used one!

", flags was changed from " PTR32_FORMAT ", consider using jsig library",
// This method is a periodic task to check for misbehaving JNI applications
// under CheckJNI, we can add any periodic checks here

// SEGV and BUS if overridden could potentially prevent
// generation of hs*.log in the event of a crash, debugging
// such a case can be very challenging, so we absolutely
// check the following for a good measure:

// ReduceSignalUsage allows the user to override these handlers

// only trust the default sigaction, in case it has been interposed

// No need to check this sig any longer

// No need to check this sig any longer

// this is called _before_ most of the global arguments have been parsed

char dummy;   /* used to get a guess on initial stack address */

// first_hrtime = gethrtime();

// With LinuxThreads the JavaMain thread pid (primordial thread)
// is different than the pid of the java launcher thread.
// So, on Linux, the launcher thread pid is passed to the VM
// via the sun.java.launcher.pid property.
// Use this property instead of getpid() if it was correctly passed.

// main_thread points to the aboriginal thread

// To install functions for atexit system call

// this is called _after_ the global arguments have been parsed

// Allocate a single page and mark it as readable for safepoint polling

// initialize suspend/resume support - must do this before signal_sets_init()
perror("SR_initialize failed");
tty->print_cr("\nThe stack size specified is too small, "

// Make the stack size a multiple of the page size so that

tty->print_cr("[HotSpot is running with %s, %s(%s)]\n",

// There's only one node(they start from 0), disable NUMA.

// set the number of file descriptors to max. print out error
perror("os::init_2 getrlimit failed");
perror("os::init_2 setrlimit failed");

// Initialize lock used to serialize thread creation (see os::create_thread)

tty->print_cr("There was an error trying to initialize the HPI library.");
// at-exit methods are called in the reverse order of their registration.
// atexit functions are called on return from main or as a result of a
// call to exit(3C). There can be only 32 of these functions registered
// and atexit() does not set errno.

// only register atexit functions if PerfAllowAtExitRegistration is set.
// atexit functions can be delayed until process exit time, which
// can be problematic for embedded VM situations. Embedded VMs should
// call DestroyJavaVM() to assure that VM resources are released.

// note: perfMemory_exit_helper atexit function may be removed in
// the future if the appropriate cleanup code can be added to the
// VM_Exit VMOperation's doit method.
warning("os::init2 atexit(perfMemory_exit_helper) failed");

// initialize thread priority policy

// Mark the polling page as unreadable
fatal("Could not disable polling page");

// Mark the polling page as readable
fatal("Could not enable polling page");
// Linux doesn't yet have a (official) notion of processor sets,
// so just return the number of online processors.

// Suspends the target using the signal mechanism and then grabs the PC before
// resuming the target. Used by the flat-profiler only

// Make sure that it is called by the watcher for the VMThread

// NULL context is unexpected, double-check this is the VMThread

// failure means pthread_kill failed for some reason - arguably this is
// a fatal problem, but such problems are ignored elsewhere

// 6292965: LinuxThreads pthread_cond_timedwait() resets FPU control
// word back to default 64bit precision if condvar is signaled. Java
// wants 53bit precision. Save and restore current value.

////////////////////////////////////////////////////////////////////////////////

// decode some bytes around the PC

////////////////////////////////////////////////////////////////////////////////

// This does not do anything on Linux. This is basically a hook for being
// able to use structured exception handling (thread-local exception filters)

// Prevent process from exiting upon "read error" without consuming all CPU

return buf[0] == 'y' || buf[0] == 'Y';

// Is a (classpath) directory empty?

// create binary file, rewriting existing file if required

// return current position of file pointer

// move file pointer to the specified offset

// Map a block of memory.

// Remap a block of memory.

// same as map_memory() on this OS

// Unmap a block of memory.

assert(rc == 0, "pthread_getcpuclockid is expected to return 0 code");
// current_thread_cpu_time(bool) and thread_cpu_time(Thread*, bool)
// are used by JVM M&M and JVMTI to get user+sys or user CPU time

// current_thread_cpu_time() and thread_cpu_time(Thread*) returns
// the fast estimate available on the platform.

// return user + sys since the cost is the same

// consistent with what current_thread_cpu_time() returns

// We first try accessing /proc/<pid>/cpu since this is faster to
// process. If this file is not present (linux kernels 2.5 and above)
// then we open /proc/<pid>/stat.

if ( count != 3 ) return -1;
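
// Illustrative sketch, not the code from this file: the clock-based fast
// path mentioned earlier, reduced to the calling thread only. It assumes
// the POSIX CLOCK_THREAD_CPUTIME_ID clock is available, whereas the VM
// obtains a per-thread clock id so that it can query any thread.
// current_thread_cpu_nanos() is a hypothetical name.

```cpp
#include <cassert>
#include <time.h>

// Hypothetical helper: CPU time (user+sys) consumed so far by the calling
// thread, in nanoseconds.
static long long current_thread_cpu_nanos() {
  struct timespec tp;
  int rc = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &tp);
  assert(rc == 0);   // clock_gettime is expected to return 0
  (void)rc;          // keep release builds (NDEBUG) warning-free
  return (long long)tp.tv_sec * 1000000000LL + (long long)tp.tv_nsec;
}
```

// The value only grows while the thread is actually running on a CPU, so
// the difference between two readings approximates the CPU cost of the
// work done in between.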
// The /proc/<tid>/stat aggregates per-process usage on
// new Linux kernels 2.6+ where NPTL is supported.
// The /proc/self/task/<tid>/stat still has the per-thread usage.
// There can be no directory /proc/self/task on kernels 2.4 with NPTL
// and possibly in some other cases, so we check its availability.

// This is executed only once

// Skip pid and the command string. Note that we could be dealing with
// weird command names, e.g. user could decide to rename java launcher
// to "java 1.4.2 :)", then the stat file would look like
//                1234 (java 1.4.2 :)) R ... ...
// We don't really need to know the command string, just find the last
// occurrence of ")" and then start parsing from there. See bug 4726580.
if (s == NULL ) return -1;

count = sscanf(s, "%*c %d %d %d %d %d %lu %lu %lu %lu %lu %lu %lu",

if ( count != 12 ) return -1;
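
// Illustrative sketch, not the code from this file: parsing a stat line by
// locating the last ")" as described above, then reading the fields that
// follow with the same sscanf format. parse_stat_cpu_ticks() and its
// parameter names are hypothetical; the field order is assumed to be the
// one documented in proc(5).

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: extracts user and system time (in clock ticks) from
// the text of a /proc/.../stat file. Returns false if the line cannot be
// parsed.
static bool parse_stat_cpu_ticks(const char* stat,
                                 unsigned long* user_time,
                                 unsigned long* sys_time) {
  assert(stat != NULL && user_time != NULL && sys_time != NULL);

  // The command name may itself contain ")" characters, so search from
  // the end of the string.
  const char* s = strrchr(stat, ')');
  if (s == NULL) return false;
  s++;                         // step past the ")"
  while (*s == ' ') s++;       // skip blanks before the state character

  int idummy;
  unsigned long ldummy;
  // %*c skips the state character; five ints and five unsigned longs are
  // read and discarded before the two values we want.
  int count = sscanf(s, "%*c %d %d %d %d %d %lu %lu %lu %lu %lu %lu %lu",
                     &idummy, &idummy, &idummy, &idummy, &idummy,
                     &ldummy, &ldummy, &ldummy, &ldummy, &ldummy,
                     user_time, sys_time);
  return count == 12;
}
```

// The caller still has to convert clock ticks to time units, for example
// with the value reported by sysconf(_SC_CLK_TCK).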
// System loadavg support. Returns -1 if load average cannot be obtained.
// Linux doesn't yet have a (official) notion of processor sets,
// so just return the system wide load average.

"Could not open pause file '%s', continuing immediately.\n", filename);
 * NOTE: the following code is to keep the green threads code
 * in the libjava.so happy. Once the green threads is removed,
 * these code will no longer be needed.

// Beware -- Some versions of NPTL embody a flaw where pthread_cond_timedwait() can
// hang indefinitely. For instance NPTL 0.60 on 2.4.21-4ELsmp is vulnerable.
// For specifics regarding the bug see GLIBC BUGID 261237 :
// Briefly, pthread_cond_timedwait() calls with an expiry time that's not in the future
// will either hang or corrupt the condvar, resulting in subsequent hangs if the condvar
// is used. (The simple C test-case provided in the GLIBC bug report manifests the
// hang). The JVM is vulnerable via sleep(), Object.wait(timo), LockSupport.parkNanos()
// and monitorenter when we're using 1-0 locking. All those operations may result in
// calls to pthread_cond_timedwait(). Using LD_ASSUME_KERNEL to use an older version
// of libpthread avoids the problem, but isn't practical.
//
// 1. Establish a minimum relative wait time. 50 to 100 msecs seems to work.
//    This is palliative and probabilistic, however. If the thread is preempted
//    between the call to compute_abstime() and pthread_cond_timedwait(), more
//    than the minimum period may have passed, and the abstime may be stale (in the
//    past) resulting in a hang. Using this technique reduces the odds of a hang
//    but the JVM is still vulnerable, particularly on heavily loaded systems.
//
// 2. Modify park-unpark to use per-thread (per ParkEvent) pipe-pairs instead
//    of the usual flag-condvar-mutex idiom. The write side of the pipe is set
//    NDELAY. unpark() reduces to write(), park() reduces to read() and park(timo)
//    reduces to poll()+read(). This works well, but consumes 2 FDs per extant
//
// 3. Embargo pthread_cond_timedwait() and implement a native "chron" thread
//    that manages timeouts. We'd emulate pthread_cond_timedwait() by enqueuing
//    a timeout request to the chron thread and then blocking via pthread_cond_wait().
//    This also works well. In fact it avoids kernel-level scalability impediments
//    on certain platforms that don't handle lots of active pthread_cond_timedwait()
//    timers in a graceful fashion.
//
// 4. When the abstime value is in the past it appears that control returns
//    correctly from pthread_cond_timedwait(), but the condvar is left corrupt.
//    Subsequent timedwait/wait calls may hang indefinitely. Given that, we
//    can avoid the problem by reinitializing the condvar -- by cond_destroy()
//    followed by cond_init() -- after all calls to pthread_cond_timedwait().
//    It may be possible to avoid reinitialization by checking the return
//    value from pthread_cond_timedwait(). In addition to reinitializing the
//    condvar we must establish the invariant that cond_signal() is only called
//    within critical sections protected by the adjunct mutex. This prevents
//    cond_signal() from "seeing" a condvar that's in the midst of being
//    reinitialized or that is corrupt. Sadly, this invariant obviates the
//    desirable signal-after-unlock optimization that avoids futile context switching.
//
//    I'm also concerned that some versions of NPTL might allocate an auxiliary
//    structure when a condvar is used or initialized. cond_destroy() would
//    release the helper structure. Our reinitialize-after-timedwait fix
//    put excessive stress on malloc/free and locks protecting the c-heap.
//
// We currently use (4). See the WorkAroundNPTLTimedWaitHang flag.
// It may be possible to refine (4) by checking the kernel and NPTL versions
// and only enabling the work-around for vulnerable environments.

// utility to compute the abstime argument to timedwait:
// millis is the relative timeout time
// abstime will be the absolute timeout time
// TODO: replace compute_abstime() with unpackTime()

if (seconds > 50000000) { // see man cond_timedwait(3T)

// Test-and-clear _Event, always leaves _Event set to 0, returns immediately.
// Conceptually TryPark() should be equivalent to park(0).

guarantee ((v == 0) || (v == 1), "invariant") ;
// TODO: assert that _Assoc != NULL or _Assoc == Self

// Do this the hard way by blocking ...

// for some reason, under 2.7 lwp_cond_wait() may return ETIME ...
// Treat this the same as if the wait was interrupted

// In theory we could move the ST of 0 into _Event past the unlock(),
// but then we'd need a MEMBAR after the ST.

if (v != 0) return OS_OK ;
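
// Illustrative sketch, not the code from this file: the abstime utility
// described earlier (relative millis in, absolute timespec out, with the
// relative seconds capped at 50000000). To keep the arithmetic easy to
// check, the current time is passed in rather than read from the clock;
// compute_abstime_from() and kMaxRelativeSeconds are hypothetical names.

```cpp
#include <cassert>
#include <sys/time.h>
#include <time.h>

// Cap on the relative wait, taken from the "seconds > 50000000" check.
static const long long kMaxRelativeSeconds = 50000000;

// Hypothetical helper: converts a relative timeout in milliseconds into an
// absolute timespec, starting from the supplied 'now'.
static void compute_abstime_from(const struct timeval& now, long long millis,
                                 struct timespec* abstime) {
  assert(abstime != NULL);
  if (millis < 0) millis = 0;

  long long seconds = millis / 1000;
  millis %= 1000;
  if (seconds > kMaxRelativeSeconds) {
    seconds = kMaxRelativeSeconds;
  }

  abstime->tv_sec = now.tv_sec + (time_t)seconds;
  long usec = (long)now.tv_usec + (long)millis * 1000;
  if (usec >= 1000000) {       // carry microseconds into seconds
    abstime->tv_sec += 1;
    usec -= 1000000;
  }
  abstime->tv_nsec = usec * 1000;
}
```

// In real use 'now' would come from gettimeofday(), and the result would be
// handed to pthread_cond_timedwait().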
// We do this the hard way, by blocking the thread.
// Consider enforcing a minimum timeout value.

// Object.wait(timo) will return because of
// Thread.interrupt and object.notify{All} both call Event::set.
// That is, we treat thread.interrupt as a special case of notification.
// The underlying Solaris implementation, cond_timedwait, admits
// JVM from making those visible to Java code. As such, we must
// filter out spurious wakeups. We assume all ETIME returns are valid.
//
// TODO: properly differentiate simultaneous notify+interrupt.
// In that case, we should propagate the notify to another waiter.

// We consume and ignore EINTR and spurious wakeups.

// The LD of _Event could have reordered or be satisfied
// by a read-aside from this processor's write buffer.
// To avoid problems execute a barrier and then

// Wait for the thread associated with the event to vacate

// Note that we signal() _after dropping the lock for "immortal" Events.
// This is safe and avoids a common class of futile wakeups. In rare
// circumstances this can cause a thread to return prematurely from
// cond_{timed}wait() but the spurious wakeup is benign and the victim will
// simply re-test the condition and re-park itself.

// -------------------------------------------------------

 * The solaris and linux implementations of park/unpark are fairly
 * conservative for now, but can be improved. They currently use a
 * Park decrements count if > 0, else does a condvar wait. Unpark
 * sets count to 1 and signals condvar. Only one thread ever waits
 * on the condvar. Contention seen when trying to park implies that someone
 * is unparking you, so don't wait. And spurious returns are fine, so there
 * is no need to track notifications.

 * This code is common to linux and solaris and will be moved to a
 * common place in dolphin.
 *
 * The passed in time value is either a relative time in nanoseconds
 * or an absolute time in milliseconds. Either way it has to be unpacked
 * into suitable seconds and nanoseconds components and stored in the
 * given timespec structure.
 * Given time is a 64-bit value and the time_t used in the timespec is only
 * a signed-32-bit value (except on 64-bit Linux) we have to watch for
 * overflow if times way in the future are given. Further on Solaris versions
 * prior to 10 there is a restriction (see cond_timedwait) that the specified
 * number of seconds, in abstime, is less than current_time + 100,000,000.
 * As it will be 28 years before "now + 100000000" will overflow we can
 * ignore overflow and just impose a hard-limit on seconds using the value
 * of "now + 100,000,000". This places a limit on the timeout of about 3.17

// Optional fast-path check:
// Return immediately if a permit is available.

// Optional optimization -- avoid state transitions if there's an interrupt pending.
// Check interrupt before trying to wait

if (time < 0) { // don't wait at all

// Enter safepoint region
// Beware of deadlocks such as 6317397.
// The per-thread Parker:: mutex is a classic leaf-lock.
// In particular a thread must never block on the Threads_lock while
// holding the Parker:: mutex. If safepoints are pending both
// the ThreadBlockInVM() CTOR and DTOR may grab Threads_lock.

// Don't wait if cannot get lock since interference arises from
// unblocking. Also, check interrupt before trying wait

// Don't catch signals while blocked; let the running threads have the signals.
// (This allows a debugger to break into the running thread.)

// cleared by handle_special_suspend_equivalent_condition() or java_suspend_self()

// If externally suspended while waiting, re-suspend

// Run the specified command in a separate process. Return its exit value,
// or -1 on failure (e.g. can't fork a new process).
// Unlike system(), this function can be called from signal handler. It
// doesn't block SIGINT et al.

// pthread_atfork handlers and reset pthread library. All we need is a
// separate process to execve. Make a direct syscall to fork process.
// On IA64 there's no fork syscall, we have to use fork() and hope for

// execve() in LinuxThreads will call pthread_kill_other_threads_np()
// first to kill every thread on the thread list. Because this list is
// not reset by fork() (see notes above), execve() will instead kill
// every thread in the parent process. We know this is the only thread
// in the new process, so make a system call directly.
// IA64 should use normal execve() from glibc to match the glibc fork()

// care about the actual exit code, for now.

// Wait for the child process to exit. This returns immediately if
// the child has already exited. */

// The child exited normally; get its exit code.

// The child exited because of a signal
// The best value to return is 0x80 + signal number,
// because that is what all Unix shells do, and because
// it allows callers to distinguish between process exit and
// process death by signal.

// Unknown exit code; pass it through
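
// Illustrative sketch, not the code from this file: decoding a waitpid()
// status the way the comments above describe (normal exit code, or
// 0x80 + signal number, or the raw status otherwise). It uses the plain
// glibc fork() rather than the direct syscall discussed above, and both
// exit_value_from_status() and run_child() are hypothetical names.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: maps a waitpid() status to a shell-style exit value.
static int exit_value_from_status(int status) {
  if (WIFEXITED(status)) {
    return WEXITSTATUS(status);      // the child exited normally
  }
  if (WIFSIGNALED(status)) {
    return 0x80 + WTERMSIG(status);  // the child was killed by a signal
  }
  return status;                     // unknown; pass it through
}

// Hypothetical driver: forks a child that either terminates itself with
// 'sig' (when sig != 0) or exits with 'exit_code', then reports the decoded
// result. Returns -1 if fork() or waitpid() fails.
static int run_child(int exit_code, int sig) {
  assert(exit_code >= 0 && exit_code <= 255);
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    if (sig != 0) raise(sig);
    _exit(exit_code);                // skip atexit handlers in the child
  }
  int status = 0;
  while (waitpid(pid, &status, 0) < 0) {
    if (errno != EINTR) return -1;   // retry only when interrupted
  }
  return exit_value_from_status(status);
}
```

// With this mapping a caller can tell "exited with code 9" (9) apart from
// "killed by signal 9" (0x80 + 9).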