os_linux.cpp revision 579
/*
 * Copyright 1999-2009 Sun Microsystems, Inc. All Rights Reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
 * CA 95054 USA or visit www.sun.com if you need additional information or
 * have any questions.
 */

// do not include precompiled header file
// put OS-includes here
// for timer info max values which include all bits

////////////////////////////////////////////////////////////////////////////////
// For diagnostics, to print a message once. See run_periodic_checks.

/* do not use any signal number less than SIGSEGV, see 4355769 */
/* Used to protect dlsym() calls */

////////////////////////////////////////////////////////////////////////////////
  // values in struct sysinfo are "unsigned long"
  // See comments under Solaris for alignment considerations

////////////////////////////////////////////////////////////////////////////////
// environment support

  if (len > 0)
    buf[0] = 0;  // return a null string

// Return true if the user is running as root.

// i386: 224, ia64: 1105, amd64: 186, sparc: 143

// Cpu architecture string

// Returns the kernel thread id of the currently running thread. The kernel
// thread id is used to access /proc.
// (Note that getpid() on LinuxThreads returns the kernel thread id too; but
// on NPTL, it returns the same pid for all threads, as required by POSIX.)
  // old kernel, no NPTL support

// Most versions of Linux have a bug where the number of processors is
// determined by looking at the /proc file system. In a chroot environment,
// the system call returns 1. This causes the VM to act as if it is
// a single processor and elide locking (see the is_MP() call).
  "Java may be unstable running multithreaded in a chroot "
  "environment on Linux when the /proc filesystem is not mounted.";
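The chroot caveat above can be observed directly: the VM's processor count ultimately comes from sysconf, which on glibc may fall back to reporting a single CPU when /proc is unavailable. A minimal sketch of such a query (the helper name is illustrative, not HotSpot's):

```cpp
#include <unistd.h>

// Report the number of configured processors, roughly as the VM would see it.
// In a chroot without /proc mounted, glibc may only be able to report 1,
// which is the situation the warning above describes.
long configured_processor_count() {
  long n = sysconf(_SC_NPROCESSORS_CONF);
  return (n < 1) ? 1 : n;  // fall back to 1 if sysconf fails
}
```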
// sysinfo(SI_ARCHITECTURE, arch, sizeof(arch));

  // The next steps are taken in the product version:
  //
  // Obtain the JAVA_HOME value from the location of libjvm[_g].so.
  // This library should be located at:
  // <JAVA_HOME>/jre/lib/<arch>/{client|server}/libjvm[_g].so.
  //
  // If "/jre/lib/" appears at the right place in the path, then we
  // assume libjvm[_g].so is installed in a JDK and we use this path.
  //
  // Otherwise exit with message: "Could not create the Java virtual machine."
  //
  // The following extra steps are taken in the debugging version:
  //
  // If "/jre/lib/" does NOT appear at the right place in the path,
  // instead of exiting, check for the $JAVA_HOME environment variable;
  // it looks like libjvm[_g].so is installed there.
  //
  // Important note: if the location of libjvm.so changes, this
  // code needs to be changed accordingly.

  // The next few definitions allow the code to be verbatim:

  /*
   * The linker uses the following search paths to locate required
   * libraries:
   *   7: The default directories, normally /lib and /usr/lib.
   */

  /* sysclasspath, java_home, dll_dir */

  // Now cut the path to <java_home>/jre if we can.
  *pslash = '\0';  /* get rid of /{client|server|hotspot} */

  /*
   * Where to look for native libraries.
   *
   * Note: Due to a legacy implementation, most of the library path
   * is set in the launcher. This was to accommodate linking restrictions
   * on legacy Linux implementations (which are no longer supported).
   * Eventually, all the library path setting will be done here; however,
   * to prevent the proliferation of improperly built native
   * libraries, part of it remains in the launcher for now.
   *
   * Construct the invariant part of ld_library_path. Note that the
   * space for the colon and the trailing null are provided by the
   * nulls included by the sizeof operator (so actually we allocate
   * a byte more than necessary).
   *
   * Get the user setting of LD_LIBRARY_PATH, and prepend it. It
   * should always exist (until the legacy problem cited above is
   * addressed).
   */
  /* That's +1 for the colon and +1 for the trailing '\0' */

  /*
   * Extensions directories.
   * Note that the space for the colon and the trailing null are provided
   * by the nulls included by the sizeof operator (so actually one byte more
   * than necessary is allocated).
   */

  /* Endorsed standards default directory. */

////////////////////////////////////////////////////////////////////////////////
// breakpoint support

  // use debugger to set breakpoint here

////////////////////////////////////////////////////////////////////////////////

  // Should also have an assertion stating we are still single-threaded.

  // Fill in signals that are necessarily unblocked for all threads in
  // the VM. Currently, we unblock the following signals:
  // SHUTDOWN{1,2,3}_SIGNAL: for shutdown hooks support (unless overridden
  // by -Xrs (=ReduceSignalUsage));
  // BREAK_SIGNAL, which is unblocked only by the VM thread and blocked by all
  // other threads. The "ReduceSignalUsage" boolean tells us not to alter
  // the dispositions or masks wrt these signals.
  // Programs embedding the VM that want to use the above signals for their
  // own purposes must, at this time, use the "-Xrs" option to prevent
  // interference with shutdown hooks and BREAK_SIGNAL thread dumping.
  // (See bug 4345157, and other related bugs.)
  // In reality, though, unblocking these signals is really a nop, since
  // these signals are not blocked by default.

  // Fill in signals that are blocked by all but the VM thread.

// These are signals that are unblocked while a thread is running Java.
// (For some reason, they get blocked by default.)

// These are the signals that are blocked while a (non-VM) thread is
// running Java. Only the VM thread handles these signals.

// These are signals that are blocked during cond_wait to allow the debugger in.

  // Save caller's signal mask before setting VM signal mask
  // Only the VM thread handles BREAK_SIGNAL ...
  // ... all other threads block BREAK_SIGNAL

//////////////////////////////////////////////////////////////////////////////
// detecting pthread library

// Save glibc and pthread version strings. Note that _CS_GNU_LIBC_VERSION
// and _CS_GNU_LIBPTHREAD_VERSION are supported in glibc >= 2.3.2. Use a
// generic name for earlier versions.
// Define macros here so we can build HotSpot on old systems.

  // _CS_GNU_LIBC_VERSION is not supported, try gnu_get_libc_version()

  // Vanilla RH-9 (glibc 2.3.2) has a bug where confstr() always tells
  // us "NPTL-0.29" even when we are running with LinuxThreads. Check if this
  // is the case. LinuxThreads has a hard limit on the max number of threads,
  // so sysconf(_SC_THREAD_THREADS_MAX) will return a positive value.
  // NPTL, on the other hand, does not have such a limit; sysconf()
  // will return -1 and errno is not changed. Check if it is really NPTL.

  // glibc before 2.3.2 only has LinuxThreads.
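The confstr() probe described above can be sketched as follows. This is an illustration of the detection pattern, not HotSpot's code; the function name and fallback string are assumptions, and the macro guard mirrors the "define macros here so we can build on old systems" note:

```cpp
#include <unistd.h>
#include <string>

// _CS_GNU_LIBPTHREAD_VERSION needs glibc >= 2.3.2; define it if the headers
// are too old to know about it (value taken from glibc's <unistd.h>).
#ifndef _CS_GNU_LIBPTHREAD_VERSION
#define _CS_GNU_LIBPTHREAD_VERSION 3
#endif

// Ask glibc which thread library it was built with, e.g. "NPTL 2.31".
// If confstr() fails (pre-2.3.2 glibc), use a generic name, since glibc
// that old only has LinuxThreads.
std::string pthread_library_version() {
  char buf[64];
  size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, sizeof(buf));
  if (n == 0 || n > sizeof(buf)) return "linuxthreads";
  return std::string(buf);
}
```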
// LinuxThreads has two flavors: floating-stack mode, which allows variable
// stack size; and fixed-stack mode. NPTL is always floating-stack.

/////////////////////////////////////////////////////////////////////////////
// Force the Linux kernel to expand the current thread stack. If "bottom" is
// close to the stack guard, the caller should block all signals.

// A special mmap() flag that is used to implement thread stacks. It tells the
// kernel that the memory region should extend downwards when needed. This
// allows early versions of LinuxThreads to only mmap the first few pages
// when creating a new thread. The Linux kernel will automatically expand the
// thread stack as needed (on page faults).
//
// However, because the memory region of a MAP_GROWSDOWN stack can grow on
// demand, if a page fault happens outside an already mapped MAP_GROWSDOWN
// region, it's hard to tell if the fault is due to a legitimate stack
// access (or a stack overrun). As a rule, if the fault happens below the
// current stack pointer, the Linux kernel does not expand the stack; instead
// a SIGSEGV is sent to the application (see Linux kernel fault.c).
//
// This Linux feature can cause SIGSEGV when the VM bangs the thread stack for
// stack overflow detection.
//
// Newer versions of LinuxThreads (since glibc-2.2, or, RH-7.x) and NPTL do
// not use this flag. However, the stack of the initial thread is not created
// by pthread; it is still MAP_GROWSDOWN. Also it's possible (though
// unlikely) that user code can create a thread with a MAP_GROWSDOWN stack
// and then attach the thread to the JVM.
//
// To get around the problem and allow stack banging on Linux, we need to
// manually expand the thread stack after receiving the SIGSEGV.
//
// There are two ways to expand the thread stack to address "bottom"; we used
// both of them in the JVM before 1.5:
//   1. adjust the stack pointer first so that it is below "bottom", and then
//      touch "bottom";
//   2. mmap() the page in question.
//
// Now that the alternate signal stack is gone, it's harder to use 2. For
// instance, if the current sp is already near the lower end of page 101, and
// we need to call mmap() to map page 100, it is possible that part of the
// mmap() frame will be placed in page 100. When page 100 is mapped, it is
// zero-filled. That will destroy the mmap() frame and cause the VM to crash.
//
// The following code works by adjusting sp first, then accessing the "bottom"
// page to force a page fault. The Linux kernel will then automatically expand
// the stack.

// _expand_stack_to() assumes its frame size is less than the page size, which
// should always be true if the function is not inlined.

#if __GNUC__ < 3    // gcc 2.x does not support the noinline attribute

  // Adjust bottom to point to the largest address within the same page; this
  // gives us a one-page buffer if alloca() allocates slightly more memory.

  // sp might be slightly above the current stack pointer; if that's the case,
  // we will alloca() a little more space than necessary, which is OK. Don't
  // use os::current_stack_pointer(), as its result can be slightly below the
  // current stack pointer, causing us to not alloca enough to reach "bottom".

//////////////////////////////////////////////////////////////////////////////
// check if it's safe to start a new thread

// Fixed-stack LinuxThreads (SuSE Linux/x86, and some versions of Redhat):
// The heap is mmap'ed at the lower end of the memory space. Thread stacks are
// allocated (MAP_FIXED) from the high address space. Every thread stack
// occupies a fixed-size slot (usually 2Mbytes, but the user can change
// it to other values if they rebuild LinuxThreads).
//
// The problem with MAP_FIXED is that mmap() can still succeed even if part of
// the memory region has already been mmap'ed. That means if we have too
// many threads and/or a very large heap, eventually a thread stack will
// collide with the heap.
//
// Here we try to prevent heap/stack collision by comparing the current
// stack bottom with the highest address that has been mmap'ed by the JVM,
// plus a safety margin for memory maps created by native code.
//
// This feature can be disabled by setting ThreadSafetyMargin to 0.

  // not safe if our stack extends below the safety margin

// Floating-stack LinuxThreads or NPTL:
// Unlike fixed-stack LinuxThreads, thread stacks are not MAP_FIXED. When
// there's not enough space left, pthread_create() will fail. If we come
// here, that means enough space has been reserved for the stack.

// Thread start routine for all newly created threads

  // Try to randomize the cache line index of hot stack frames.
  // This helps when threads of the same stack traces evict each other's
  // cache lines. The threads can be either from the same JVM instance, or
  // from different JVM instances. The benefit is especially true for
  // processors with hyperthreading technology.

  // non-floating-stack LinuxThreads needs an extra check, see above

  // thread_id is the kernel thread id (similar to the Solaris LWP id)

  // initialize signal mask for this thread
  // initialize floating point control register

  // handshaking with parent thread
  // wait until os::start_thread()

  // call one more level start routine

  // Allocate the OSThread object

  // set the correct thread state
  // Initial state is ALLOCATED but not INITIALIZED

  // init thread attributes

  // calculate stack size if it's not specified by caller
  // Java threads use ThreadStackSize, whose default value can be changed
  // with the flag -Xss
  // use VMThreadStackSize if CompilerThreadStackSize is not defined
  // let pthread_create() pick the default value.

  // Serialize thread creation if we are running with fixed-stack LinuxThreads

    // Need to clean up stuff we've allocated so far

  // Store pthread info into the OSThread

  // Wait until the child thread is either initialized or aborted

    // Aborted due to the thread limit being reached

  // The thread is returned suspended (in state INITIALIZED),
  // and is started higher up in the call chain

/////////////////////////////////////////////////////////////////////////////
// attach existing thread

// bootstrap the main thread

  // Allocate the OSThread object

  // Store pthread info into the OSThread

  // initialize floating point control register

  // Initial thread state is RUNNABLE

  // If the current thread is the initial thread, its stack is mapped on
  // demand; see the notes about MAP_GROWSDOWN. Here we try to force the
  // kernel to map the entire stack region, to avoid SEGV in stack banging.
  // It is also useful to get around the heap-stack-gap problem on the SuSE
  // kernel (see 4821821 for details). We first expand the stack to the top
  // of the yellow zone, then enable the stack yellow zone (the order is
  // significant: enabling the yellow zone first will crash the JVM on SuSE
  // Linux), so there is no gap between the last two virtual memory regions.

  // initialize signal mask for this thread
  // and save the caller's signal mask

// Free Linux resources related to the OSThread

  // Restore caller's signal mask

//////////////////////////////////////////////////////////////////////////////

  assert(rslt == 0, "cannot allocate thread local storage");
  // Note: This is currently not used by the VM, as we don't destroy the TLS key

//////////////////////////////////////////////////////////////////////////////
// Check if the current thread is the initial thread, similar to Solaris
// thr_main.

  // If called before init is complete, the thread stack bottom will be null.
  // Can be called if a fatal error occurs before initialization.

  "os::init did not locate initial thread's stack region");
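The TLS pattern behind the "cannot allocate thread local storage" assert above can be sketched as follows: a single pthread key is created once at startup and never destroyed, per the note above. Names here are illustrative, not HotSpot's:

```cpp
#include <pthread.h>
#include <cstddef>

// One process-wide TLS key, created once during VM startup.
static pthread_key_t tls_key;

// No destructor is registered because the key is never destroyed;
// the VM asserts that creation succeeds.
bool create_tls_key() {
  int rslt = pthread_key_create(&tls_key, NULL);
  return rslt == 0;
}

void  set_tls(void* p) { pthread_setspecific(tls_key, p); }
void* get_tls()        { return pthread_getspecific(tls_key); }
```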
// Find the virtual memory area that contains addr

      // skip the rest of the line
      if (ch == EOF || ch == (int)'\n') break;
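The find_vma idea above (scan the process's memory map, return the bounds of the region containing an address) can be sketched as a stand-alone /proc/self/maps reader. This is a simplified illustration, not HotSpot's code; only the address range at the start of each line is parsed, and the rest of the line is skipped exactly as in the loop above:

```cpp
#include <cstdio>

// Find the virtual memory area that contains addr; return its bounds
// through lo/hi. Returns false if /proc is unavailable (e.g. chroot)
// or no mapping contains addr.
bool find_vma(void* addr, void** lo, void** hi) {
  FILE* fp = fopen("/proc/self/maps", "r");
  if (fp == NULL) return false;
  unsigned long low, high;
  while (fscanf(fp, "%lx-%lx", &low, &high) == 2) {
    if ((unsigned long)addr >= low && (unsigned long)addr < high) {
      *lo = (void*)low;
      *hi = (void*)high;
      fclose(fp);
      return true;
    }
    // skip the rest of the line (permissions, offset, device, inode, path)
    int ch;
    while ((ch = fgetc(fp)) != EOF && ch != '\n') { }
  }
  fclose(fp);
  return false;
}
```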
// Locate the initial thread stack. This special handling of the initial
// thread stack is needed because pthread_getattr_np() on most (all?) Linux
// distros returns a bogus value for the initial thread.

  // stack size is the easy part, get it from RLIMIT_STACK

  // 6308388: a bug in ld.so will relocate its own .data section to the
  // lower end of the primordial stack; reduce the ulimit -s value a little
  // bit so we won't install a guard page on ld.so's data section.

  // 4441425: avoid crash with "unlimited" stack size on SuSE 7.1 or Redhat
  // 7.1; in both cases we will get 2G in the return value.
  // 4466587: glibc 2.2.x compiled w/o "--enable-kernel=2.4.0" (RH 7.0,
  // SuSE 7.2, Debian) cannot handle an alternate signal stack correctly
  // for the initial thread if its stack size exceeds 6M. Cap it at 2M,
  // in case other parts of glibc still assume a 2M max stack size.
  // FIXME: the alt signal stack is gone, maybe we can relax this constraint?
  // The problem still exists on RH7.2 (IA64 anyway), but 2MB is a little small.

  // Try to figure out where the stack base (top) is. This is harder.
  //
  // When an application is started, glibc saves the initial stack pointer in
  // a global variable "__libc_stack_end", which is then used by system
  // libraries. __libc_stack_end should be pretty close to the stack top. The
  // variable has been available since the very early days. However, because
  // it is a private interface, it could disappear in the future.
  //
  // The Linux kernel saves start_stack information in /proc/<pid>/stat.
  // Similar to __libc_stack_end, it is very close to the stack top, but isn't
  // the real stack top. Note that /proc may not exist if the VM is running
  // as a chroot program, so reading /proc/<pid>/stat could fail. Also the
  // contents of /proc/<pid>/stat could change in the future (though unlikely).
  //
  // We try __libc_stack_end first. If that doesn't work, look for
  // /proc/<pid>/stat. If neither of them works, we use the current stack
  // pointer as a hint, which should work well in most cases.

  // try __libc_stack_end first

  // see if we can get the start_stack field from /proc/self/stat

    // Figure out what the primordial thread stack base is. Code is inspired
    // by email from Hans Boehm. /proc/self/stat begins with the current pid,
    // followed by the command name surrounded by parentheses, state, etc.

    // Skip the pid and the command string. Note that we could be dealing with
    // weird command names, e.g. the user could decide to rename the java
    // launcher to "java 1.4.2 :)", then the stat file would look like
    //   1234 (java 1.4.2 :)) R ... ...
    // We don't really need to know the command string, just find the last
    // occurrence of ")" and then start parsing from there. See bug 4726580.

    /*                                     1   1   1   1   1   1   1   1   1   1   2   2   2   2   2   2   2   2   2 */
    /*              3  4  5  6  7  8   9   0   1   2   3   4   5   6   7   8   9   0   1   2   3   4   5   6   7   8 */
    i = sscanf(s, "%c %d %d %d %d %d %lu %lu %lu %lu %lu %lu %lu %ld %ld %ld %ld %ld %ld "
               &start,   /* 22 UINTX_FORMAT */
               &vsize,   /* 23 UINTX_FORMAT */
               &rss,     /* 24 UINTX_FORMAT */
               &scodes,  /* 26 UINTX_FORMAT */
               &ecode,   /* 27 UINTX_FORMAT */

      // product mode - assume we are the initial thread, good luck in the
      // embedded case
      warning("Can't detect initial thread stack location - bad conversion");
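The /proc/self/stat technique described above can be sketched as a simplified stand-alone reader: skip past the pid and the parenthesized command name by finding the last ')' (the bug 4726580 trick), then walk to the start_stack field (field 28 in proc(5)). This is an illustration, not HotSpot's sscanf-based code:

```cpp
#include <cstdio>
#include <cstring>

// Return the kernel's start_stack value for this process, or 0 if /proc
// is unavailable or the format is unexpected.
unsigned long read_start_stack() {
  char s[2048];
  FILE* fp = fopen("/proc/self/stat", "r");
  if (fp == NULL) return 0;                 // e.g. chroot without /proc
  size_t n = fread(s, 1, sizeof(s) - 1, fp);
  fclose(fp);
  s[n] = '\0';
  char* p = strrchr(s, ')');                // last ')' ends the command name
  if (p == NULL) return 0;
  p++;
  unsigned long start_stack = 0;
  // After ')' the next token is field 3 (state); start_stack is field 28.
  for (int field = 3; field <= 28; field++) {
    while (*p == ' ') p++;                  // skip separator
    if (field == 28) {
      sscanf(p, "%lu", &start_stack);
      break;
    }
    while (*p != '\0' && *p != ' ') p++;    // skip this token
  }
  return start_stack;
}
```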
  // For some reason we can't open /proc/self/stat (for example, running on
  // FreeBSD with a Linux emulator, or inside chroot). This should work for
  // most cases, so don't abort:

  // Now we have a pointer (stack_start) very close to the stack top; the
  // next thing to do is to figure out the exact location of the stack top.
  // We can find the virtual memory area that contains stack_start with
  // find_vma(), and its upper limit is the real stack top. (Again, this
  // would fail if running inside chroot, because /proc may not exist.)

    // success, "high" is the true stack top. (Ignore "low", because the
    // initial thread stack grows on demand; its real bottom is
    // high - RLIMIT_STACK.)

    warning("Can't detect initial thread stack location - find_vma failed");
  // best effort: stack_start is normally within a few pages below the real
  // stack top; use it as the stack top, and reduce the stack size so we
  // won't put the guard page outside the stack.

  // stack_top could be partially down the page so align it

////////////////////////////////////////////////////////////////////////////////

// Time since start-up in seconds to a fine granularity.
// Used by VMSelfDestructTimer and the MemProfiler.

// For now, we say that Linux does not support vtime. I have no idea
// whether it can actually be made to (DLD, 9/13/05).

  // better than nothing, but not much

  // we do dlopen's in this particular order due to a bug in the Linux
  // dynamic loader (see 6348968) leading to a crash on exit

  // See if the monotonic clock is supported by the kernel. Note that some
  // early implementations simply return kernel jiffies (updated every
  // 1/100 or 1/1000 second). It would be bad to use such a low-res clock
  // for nano time (though the monotonic property is still nice to have).
  // It's fixed in newer kernels; however, clock_getres() still returns
  // 1/HZ. We check if clock_getres() works, but will ignore its reported
  // resolution for now. Hopefully as people move to new kernels, this
  // won't be a problem.

    // yes, monotonic clock is supported

    // close librt if there is no monotonic clock

  // Switch to using fast clocks for thread cpu time if
  // the sys_clock_getres() returns 0 error code.
  // Note that some kernels may support the current thread
  // clock (CLOCK_THREAD_CPUTIME_ID) but not the clocks
  // returned by the pthread_getcpuclockid().
  // If the fast Posix clocks are supported then the sys_clock_getres()
  // must return at least tp.tv_sec == 0, which means a resolution
  // better than 1 sec. This is an extra check for reliability.
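The kernel capability check described above boils down to calling clock_getres() and requiring a sub-second resolution (tv_sec == 0), while ignoring the often-misreported 1/HZ value. A minimal sketch of that check (helper name is illustrative):

```cpp
#include <ctime>

// CLOCK_MONOTONIC is usable if clock_getres() succeeds with a resolution
// better than one second. As noted above, the reported resolution itself
// (often 1/HZ) is deliberately ignored.
bool supports_monotonic_clock() {
  struct timespec res;
  if (clock_getres(CLOCK_MONOTONIC, &res) != 0) return false;
  return res.tv_sec == 0;
}
```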
// CLOCK_MONOTONIC - amount of time since some arbitrary point in the past
// gettimeofday - based on time in seconds since the Epoch, thus does not wrap

  // gettimeofday is a real-time clock, so it skips

// Return the real, user, and system times in seconds from an
// arbitrary fixed point in the past.

////////////////////////////////////////////////////////////////////////////////
// Note: os::shutdown() might be called very early during initialization, or
// called from a signal handler. Before adding something to os::shutdown(),
// make sure it is async-safe and can handle a partially initialized VM.

  // allow PerfMemory to attempt cleanup of any persistent resources
  // needs to remove object in file system

  // flush buffered output, finish log files

// Note: os::abort() might be called very early during initialization, or
// called from a signal handler. Before adding something to os::abort(), make
// sure it is async-safe and can handle a partially initialized VM.

// Die immediately, no exit hook, no abort hook, no cleanup.
// _exit() on LinuxThreads only kills the current thread

// unused on linux for now.

// Under the old Linux thread library, Linux gives each thread
// its own process id. Because of this, each thread will return
// a different pid if this method were to return the result
// of getpid(2). Linux provides no api that returns the pid
// of the launcher thread for the vm. This implementation
// returns a unique pid, the pid of the launcher thread
// that starts the vm 'process'.

// Under NPTL, getpid() returns the same pid as the
// launcher thread rather than a unique pid per thread.
// Use gettid() if you want the old pre-NPTL behaviour.

// if you are looking for the result of a call to getpid() that
// returns a unique pid for the calling thread, then look at the
// gettid() method.

  /* Quietly truncate on buffer overflow. Should be an error. */

// check if addr is inside libjvm[_g].so

  char* fname;  // output: library name

  // iterate through all loadable segments

    // base address of a library is the lowest address of its loaded segments

    // see if 'addr' is within the current segment

  // dlpi_name is NULL or empty if the ELF file is the executable; return 0
  // so dll_address_to_library_name() can fall through to use dladdr(), which
  // can figure out the executable name from argv[0].

// There is a bug in the old glibc dladdr() implementation such that it could
// resolve to the wrong library name if the .so file has a base address !=
// NULL. Here we iterate through the program headers of all loaded libraries
// to find out which library 'addr' really belongs to. This workaround can be
// removed once the minimum requirement for glibc is moved to 2.3.x.

  // buf already contains the library name

// in case of error it checks if the .dll/.so was built for the
// same architecture as HotSpot is running on

  // Read the system error message into ebuf
  // It may or may not be overwritten below

    // No more space in ebuf for an additional diagnostics message

    // Can't open library, report dlerror() message

    // file i/o error - report dlerror() msg

  char* name;  // String representation

#define EM_486   6  /* Intel 80486 */

  // Identify the compatibility class for the VM's architecture and the
  // library's architecture
  // Obtain string descriptions for architectures

  "Didn't find running architecture code (running_arch_code) in arch_array");

  // Even though running architecture detection failed,
  // we may still continue with reporting the dlerror() message

  " (Possible cause: can't load %s-bit .so on a %s-bit platform)",
  " (Possible cause: can't load this .so (machine code=0x%x) on a %s-bit platform)",

/*
 * glibc-2.0 libdl is not MT safe. If you are building with any glibc,
 * chances are you might want to run the generated bits against glibc-2.0
 * libdl.so, so always use locking for any version of glibc.
 */

    st->print("Can not get library information for pid = %d\n", pid);
// Try to identify popular distros.
// Most Linux distributions have an /etc/XXX-release file, which contains
// the OS version string. Some have more than one /etc/XXX-release file,
// so the order is important.

// Print a warning if an unsafe chroot environment is detected

  // values in struct sysinfo are "unsigned long"
  // but they're the same for all the linux arch that we support
  // and they're the same for solaris but there's no common place to put this.

  const char* ill_names[] = { "ILL0", "ILL_ILLOPC", "ILL_ILLOPN",
                              "ILL_ILLADR", "ILL_ILLTRP", "ILL_PRVOPC",
                              "ILL_PRVREG", "ILL_COPROC", "ILL_BADSTK" };

  const char* fpe_names[] = { "FPE0", "FPE_INTDIV", "FPE_INTOVF",
                              "FPE_FLTDIV", "FPE_FLTOVF", "FPE_FLTUND",
                              "FPE_FLTRES", "FPE_FLTINV", "FPE_FLTSUB",
                              "FPE_FLTDEN" };

  const char* segv_names[] = { "SEGV0", "SEGV_MAPERR", "SEGV_ACCERR" };

  const char* bus_names[] = { "BUS0", "BUS_ADRALN", "BUS_ADRERR",
                              "BUS_OBJERR" };

  assert(c > 0, "unexpected si_code");
  st->print("\n\nError accessing class data sharing archive." \
            " Mapped file inaccessible during execution, " \

  assert(false, "must use a large-enough buffer");
// Lazy resolve the path to the current module.

// Support for the gamma launcher. Typical value for buf is
// the path to libjvm[_g].so within a JDK. If "/jre/lib/" appears at
// the right place in the string, then assume we are installed in a JDK and
// we're done. Otherwise, check for a JAVA_HOME environment variable and fix
// up the path so it looks like libjvm.so is installed there (append a
// fake suffix).

  for (--p; p > buf && *p != '/'; --p)
    ;  // scan back to the previous '/'

  // Look for JAVA_HOME in the environment.

    p = strstr(p, "_g") ? "_g" : "";

  // Use the current module name "libjvm[_g].so" instead of
  // "libjvm"debug_only("_g")".so", since for the fastdebug version
  // we should have "libjvm.so" but debug_only("_g") adds "_g"!
  // It is used when we are choosing the HPI library's name
  // "libhpi[_g].so" in hpi::initialize_get_interface().

  // Go back to the path of the .so

  // no prefix required, not even "_"

////////////////////////////////////////////////////////////////////////////////
// sun.misc.Signal support

  // 4511530 - sem_post is serialized and handled by the manager thread. When
  // the program is interrupted by Ctrl-C, SIGINT is sent to every thread. We
  // don't want to flood the manager thread with sem_post requests.

  // Ctrl-C is pressed during error reporting, likely because the error
  // handler fails to abort. Let the VM die immediately.

  // -1 means registration failed

/*
 * The following code is moved from os.cpp for making this
 * code platform specific, which it is by its very nature.
 */

// Will be modified when max signal is changed to be dynamic

// a counter for each possible signal value

// Linux(POSIX) specific hand shaking semaphore.

// Initialize signal structures

  // Initialize signal semaphore

  for (int i = 0; i < NSIG + 1; i++) {
  // cleared by handle_special_suspend_equivalent_condition() or
  // java_suspend_self()

  // were we externally suspended while we were waiting?
  // The semaphore has been incremented, but while we were waiting
  // another thread suspended us. We don't want to continue running
  // while suspended because that would surprise the thread that
  // suspended us.

////////////////////////////////////////////////////////////////////////////////

// Seems redundant as all get out

// Solaris allocates memory by pages.

// Rationale behind this function:
//  current (Mon Apr 25 20:12:18 MSD 2005) oprofile drops samples without an
//  executable mapping for the address (see lookup_dcookie() in the kernel
//  module), thus we cannot get samples for JITted code. Here we create a
//  private executable mapping over the code cache, and then we can use the
//  standard (well, almost, as the mapping can change) way to provide
//  info for the reporting script by storing the timestamp and location of
//  the symbol

// NOTE: Linux kernel does not really reserve the pages for us.
//       All it does is to check if there are enough free pages
//       left at the time of mmap(). This could be a potential
//       problem.

  // sched_getcpu() should be in libc.

// Create a cpu -> node mapping
// rebuild_cpu_to_node_map() constructs a table mapping cpu id to node id.
// The table is later used in get_node_by_cpu().

  const size_t NCPUS = 32768;
  // Since the buffer size computation is very obscure
  // in libnuma (possible values start from 16,
  // and continue up with every other power of 2, but less
  // than the maximum number of CPUs supported by the kernel), and
  // is subject to change (in libnuma version 2 the requirements
  // are more reasonable), we'll just hardcode the number they use.

// If 'fixed' is true, anon_mmap() will attempt to reserve anonymous memory
// at 'requested_addr'. If there are existing memory mappings at the same
// location, however, they will be overwritten. If 'fixed' is false,
// 'requested_addr' is only treated as a hint; the return value may or
// may not start from the requested address. Unlike Linux mmap(), this
// function returns NULL to indicate failure.

  // anon_mmap() should only get called during VM initialization, so we
  // don't need the lock (actually we can skip locking even if it can be
  // called from multiple threads, because _highest_vm_reserved_address is
  // just a hint about the upper limit of non-stack memory regions.)

  // Don't update _highest_vm_reserved_address, because there might be memory
  // regions above addr + size. If so, releasing a memory region only creates
  // a hole in the address space; it doesn't help prevent heap-stack collision.

// Linux wants the mprotect address argument to be page aligned.

// According to SUSv3, mprotect() should only be used with mappings
// established by mmap(), and mmap() always maps whole pages. An unaligned
// 'addr' likely indicates a problem in the VM (e.g. trying to change
// protection of malloc'ed or statically allocated memory). Check the
// caller if you hit this assert.

  // Set protections specified

// is_committed is unused.

// large_page_size on Linux is used to round up heap size. x86 uses either
// a 2M or 4M page, depending on whether PAE (Physical Address Extensions)
// mode is enabled. AMD64/EM64T uses a 2M page in 64-bit mode. IA64 can use
// pages as large as 256M.
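The anon_mmap() contract described above (hint vs. MAP_FIXED placement, NULL on failure, reserve-but-don't-commit) can be sketched as follows. This is an illustration of the semantics, not HotSpot's implementation:

```cpp
#include <sys/mman.h>
#include <cstddef>

// Reserve (not commit) anonymous memory. With fixed == true, the kernel
// places the mapping exactly at requested_addr, silently replacing any
// existing mapping there; with fixed == false, the address is only a hint.
// Returns NULL on failure, unlike raw mmap() which returns MAP_FAILED.
char* anon_mmap(char* requested_addr, size_t bytes, bool fixed) {
  int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE |
              (fixed ? MAP_FIXED : 0);
  // PROT_NONE: the region is reserved but not committed; accessible
  // protections are set later on the committed subranges.
  char* addr = (char*)::mmap(requested_addr, bytes, PROT_NONE, flags, -1, 0);
  return addr == MAP_FAILED ? NULL : addr;
}
```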
// Here we try to figure out the page size by parsing /proc/meminfo and
// looking for a line with the following format:
//    Hugepagesize:     2048 kB
// If we can't determine the value (e.g. /proc is not mounted, or the text
// format has been changed), we'll use the largest page size supported by
// the processor.

    if (fscanf(fp, "Hugepagesize: %d", &x) == 1) {
      large_page_size = x * K;
    } else {
      // skip to the next line
      for (;;) {
        int ch = fgetc(fp);
        if (ch == EOF || ch == (int)'\n') break;
      }
    }
// Large page support is available on 2.6 or newer kernels; some vendors
// (e.g. Redhat) have backported it to their 2.4-based distributions.
// We optimistically assume the support is available. If it later turns out
// not to be true, the VM will automatically switch to use the regular page
// size.

// Create a large shared memory region to attach to based on size.
// Currently, size is the total size of the heap.

  // Possible reasons for shmget failure:
  // 1. shmmax is too small for the Java heap.
  // 2. not enough large page memory.
  //    > increase the amount of large pages:
  //      Note 1: different Linuxes may use different names for this property,
  //              e.g. on Redhat AS-3 it is "hugetlb_pool".
  //      Note 2: it's possible there's enough physical memory available but
  //              it is so fragmented after a long run that it can't
  //              coalesce into large pages. Try to reserve large pages when
  //              the system is still "fresh".

  // Remove shmid. If shmat() is successful, the actual shared memory segment
  // will be deleted when it's detached by shmdt() or when the process
  // terminates. If shmat() is not successful, this will remove the shared
  // segment immediately.

// detaching the SHM segment will also delete it, see reserve_memory_special()

// Linux does not support anonymous mmap with large page memory. The only way
// to reserve large page memory without file backing is through the SysV
// shared memory API. The entire memory region is committed and pinned
// upfront. Hopefully this will change in the future...

// Reserve memory at an arbitrary address, only if that area is
// available (and not reserved for something else).

  // Assert only that the size is a multiple of the page size, since
  // that's all that mmap requires, and since that's all we really know
  // about at this low abstraction level. If we need higher alignment,
  // we can either pass an alignment to this method or verify alignment
  // in one of the methods further up the call chain. See bug 5044738.
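The /proc/meminfo scan described above can be sketched as a stand-alone reader: look for the "Hugepagesize:" line and return its value in kB, skipping non-matching lines character by character just as the parsing loop above does. A simplified illustration, not HotSpot's code:

```cpp
#include <cstdio>

// Return the huge page size in kB (e.g. 2048 for 2M pages), or 0 if
// /proc is not mounted or the "Hugepagesize:" line is absent.
int hugepage_size_kb() {
  FILE* fp = fopen("/proc/meminfo", "r");
  if (fp == NULL) return 0;
  int x = 0;
  for (;;) {
    if (fscanf(fp, "Hugepagesize: %d", &x) == 1) break;
    // not this line: skip to the next one
    int ch;
    do { ch = fgetc(fp); } while (ch != EOF && ch != '\n');
    if (ch == EOF) { x = 0; break; }
  }
  fclose(fp);
  return x;
}
```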
// Repeatedly allocate blocks until the block is allocated at the
// right spot. Give up after max_tries. Note that reserve_memory() will
// automatically update _highest_vm_reserved_address if the call is
// successful. The variable tracks the highest memory address ever reserved
// by JVM. It is used to detect heap-stack collision if running with
// fixed-stack LinuxThreads. Because here we may attempt to reserve more
// space than needed, it could confuse the collision detecting code. To
// solve the problem, save current _highest_vm_reserved_address and
// calculate the correct value before return.

// Linux mmap allows caller to pass an address as hint; give it a try first,
// if kernel honors the hint then we can return immediately.

// mmap() is successful but it fails to reserve at the requested address

// Is this the block we wanted?

// Does this overlap the block we wanted? Give back the overlapped
// parts and try again.

// Give back the unused reserved pieces.
for (int j = 0; j < i; ++j) {
// TODO-FIXME: reconcile Solaris' os::sleep with the linux variation.
// Solaris uses poll(), linux uses park().
// Poll() is likely a better choice, assuming that Thread.interrupt()
// generates a SIGUSRx signal. Note that SIGUSR1 can interfere with

// time moving backwards, should only happen if no monotonic clock
// not a guarantee() because JVM should not abort on kernel/glibc bugs

// cleared by handle_special_suspend_equivalent_condition() or
// java_suspend_self() via check_and_wait_while_suspended()

// were we externally suspended while we were waiting?

// It'd be nice to avoid the back-to-back javaTimeNanos() calls on
// time moving backwards, should only happen if no monotonic clock
// not a guarantee() because JVM should not abort on kernel/glibc bugs

// %% make the sleep time an integer flag. for now use 1 millisec.

// Sleep forever; naked call to OS-specific sleep; use with CAUTION
while (true) {      // sleep forever ...
  ::sleep(100);     // ... 100 seconds at a time
}

// Used to convert frequent JVM_Yield() to nops

// Yields to all threads, including threads with lower priorities.
// Threads on Linux all have the same priority, so the Solaris style
// os::yield_all() with nanosleep(1ms) is not necessary.

// Called from the tight loops to possibly influence time-sharing heuristics

////////////////////////////////////////////////////////////////////////////////
// thread priority support

// Note: Normal Linux applications are run with SCHED_OTHER policy. SCHED_OTHER
// only supports dynamic priority, static priority must be zero. For real-time
// applications, Linux supports SCHED_RR which allows static priority (1-99).
// However, for large multi-threaded applications, SCHED_RR is not only slower
// than SCHED_OTHER, but also very unstable (my volano tests hang hard 4 out
// of 5 runs - Sep 2005).

// not the entire user process, and user level threads are 1:1 mapped to kernel
// threads. It has always been the case, but could change in the future. For
// this reason, the code should not be used as default (ThreadPriorityPolicy=0).
// It is only used when ThreadPriorityPolicy=1 and requires root privilege.

  19,              // 0 Entry should never be used

// Only root can raise thread priority. Don't allow ThreadPriorityPolicy=1
// if effective uid is not root. Perhaps, a more elegant way of doing
// this is to test CAP_SYS_NICE capability, but that will require libcap.so
warning("-XX:ThreadPriorityPolicy requires root privilege on Linux");
// Hint to the underlying OS that a task switch would not be good.
// Void return because it's a hint and can fail.

////////////////////////////////////////////////////////////////////////////////

// the low-level signal-based suspend/resume support is a remnant from the
// old VM-suspension that used to be for java-suspension, safepoints etc,
// within hotspot. Now there is a single use-case for this:
//    - calling get_thread_pc() on the VMThread by the flat-profiler task
//      that runs in the watcher thread.
// The remaining code is greatly simplified from the more general suspension
// code that used to be used.
//
// The protocol is quite simple:
// - suspend:
//     - sends a signal to the target thread
//     - polls the suspend state of the osthread using a yield loop
//     - target thread signal handler (SR_handler) sets suspend state
//       and blocks in sigsuspend until continued
// - resume:
//     - sets target osthread state to continue
//     - sends signal to end the sigsuspend loop in the SR_handler
//
// Note that the SR_lock plays no role in this suspend/resume protocol.

// notify the suspend action is completed, we have now resumed

// Handler function invoked when a thread's execution is suspended or
// resumed. We have to be careful that only async-safe functions are
// called here (Note: most pthread functions are not async safe and
// should be avoided.)
//
// Note: sigwait() is a more natural fit than sigsuspend() from an
// interface point of view, but sigwait() prevents the signal handler
// from being run. libpthread would get very confused by not having
// its signal handlers run and prevents sigwait()'s use with the
// mutex granting signal.
//
// Currently only ever called on the VMThread

// Save and restore errno to avoid confusing native code with EINTR

// read current suspend action

// Notify the suspend action is about to be completed. do_suspend()
// waits until SR_SUSPENDED is set and then returns. We will wait
// here for a resume signal and that completes the suspend-other
// action; suspend and resume are always requested from
// the same thread - so there are no races

// get current set of blocked signals and unblock resume signal

// wait here until we are resumed

// ignore all returns until we get a resume signal

// nothing special to do - just leave the handler

if ((s = ::getenv("_JAVA_SR_SIGNUM")) != 0) {

  "SR_signum must be greater than max(SIGSEGV, SIGBUS), see 4355769");

// SR_signum is blocked by default.

// 4528190 - We also need to block pthread restart signal (32 on all
// supported Linux platforms). Note that LinuxThreads need to block
// this signal for all threads to work properly. So we don't have
// to use hard-coded signal number when setting up the mask.

// returns true on success and false on error - really an error is fatal
// but this seems the normal response to library errors

// mark as suspended and send signal

// check status and wait until notified of suspension

// check status and wait until notified of resumption

////////////////////////////////////////////////////////////////////////////////

  "possibility of dangling Thread pointer");
// More than one thread can get here with the same value of osthread,
// resulting in multiple notifications. We do, however, want the store
// to interrupted() to be visible to other threads before we execute unpark().

// For JSR166. Unpark even if interrupt status already was set

  "possibility of dangling Thread pointer");

// consider thread->_SleepEvent->reset() ... optional optimization

///////////////////////////////////////////////////////////////////////////////////

// This routine may be used by user applications as a "hook" to catch signals.
// The user-defined signal handler must pass unrecognized signals to this
// routine, and if it returns true (non-zero), then the signal handler must
// return immediately. If the flag "abort_if_unrecognized" is true, then this
// routine will never return false (zero), but instead will execute a VM panic
// routine to kill the process.
//
// If this routine returns false, it is OK to call it again. This allows
// the user-defined signal handler to perform checks either before or after
// the VM performs its own checks. Naturally, the user code would be making
// a serious error if it tried to handle an exception (such as a null check
// or breakpoint) that the VM was generating for its own correct operation.
//
// This routine may recognize any of the following kinds of signals:
//    SIGBUS, SIGSEGV, SIGILL, SIGFPE, SIGQUIT, SIGPIPE, SIGXFSZ, SIGUSR1.
// It should be consulted by handlers for any of those signals.
//
// The caller of this routine must pass in the three arguments supplied
// to the function referred to in the "sa_sigaction" (not the "sa_handler")
// field of the structure passed to sigaction(). This routine assumes that
// the sa_flags field passed to sigaction() includes SA_SIGINFO and SA_RESTART.
//
// Note that the VM will print warnings if it detects conflicting signal
// handlers, unless invoked with the option "-XX:+AllowUserSignalHandlers".

// This boolean allows users to forward their own non-matching signals
// to JVM_handle_linux_signal, harmlessly.

// Retrieve the old signal handler from libjsig

// Retrieve the preinstalled signal handler from jvm

// Call the old signal handler

// It's more reasonable to let jvm treat it as an unexpected exception
// instead of taking the default action.

// automatically block the signal

// retrieve the chained handler

// try to honor the signal mask

// call into the chained handler

// restore the signal mask

// Tell jvm's signal handler the signal is taken care of.

if ((((unsigned int)1 << sig) & sigs) != 0) {
// Do not overwrite; user takes responsibility to forward to us.

// save the old handler in jvm

// libjsig also interposes the sigaction() call below and saves the
// old sigaction on its own.

fatal2("Encountered unexpected pre-existing sigaction handler %#lx for signal %d.",
       (long)oldhand, sig);

// Save flags, which are set by ours

// install signal handlers for signals that HotSpot needs to
// handle in order to support Java-level exception handling.

// Tell libjsig jvm is setting signal handlers

// Tell libjsig jvm finishes setting signal handlers

// We don't activate signal checker if libjsig is in place, we trust ourselves
// and if UserSignalHandler is installed all bets are off

tty->print_cr("Info: libjsig is activated, all active signal checking is disabled");

tty->print_cr("Info: AllowUserSignalHandlers is activated, all active signal checking is disabled");
// This is the fastest way to get thread cpu time on Linux.
// Returns cpu time (user+sys) for any thread, not only for current.
// POSIX compliant clocks are implemented in the kernels 2.6.16+.
// It might work on 2.6.10+ with a special kernel/glibc patch.
// For reference, please see IEEE Std 1003.1-2004.

assert(rc == 0, "clock_gettime is expected to return 0 code");

// glibc on Linux platform uses non-documented flag
// to indicate that some special sort of signal
// trampoline is used.
// We will never set this flag, and we should
// ignore this flag in our diagnostic
// See comment for SIGNIFICANT_SIGNAL_MASK define

// Maybe the handler was reset by VMError?

// Check: is it our handler?

// It is our signal handler
// check for flags, reset system-used one!

  ", flags was changed from " PTR32_FORMAT ", consider using jsig library",
// This method is a periodic task to check for misbehaving JNI applications
// under CheckJNI, we can add any periodic checks here

// SEGV and BUS if overridden could potentially prevent
// generation of hs*.log in the event of a crash, debugging
// such a case can be very challenging, so we absolutely
// check the following for a good measure:

// ReduceSignalUsage allows the user to override these handlers

// only trust the default sigaction, in case it has been interposed

// No need to check this sig any longer

// No need to check this sig any longer

// this is called _before_ most of the global arguments have been parsed

char dummy;   /* used to get a guess on initial stack address */

// first_hrtime = gethrtime();

// With LinuxThreads the JavaMain thread pid (primordial thread)
// is different than the pid of the java launcher thread.
// So, on Linux, the launcher thread pid is passed to the VM
// via the sun.java.launcher.pid property.
// Use this property instead of getpid() if it was correctly passed.

// main_thread points to the aboriginal thread

// To install functions for atexit system call

// this is called _after_ the global arguments have been parsed

// Allocate a single page and mark it as readable for safepoint polling

// initialize suspend/resume support - must do this before signal_sets_init()
perror("SR_initialize failed");
tty->print_cr("\nThe stack size specified is too small, "

// Make the stack size a multiple of the page size so that

tty->print_cr("[HotSpot is running with %s, %s(%s)]\n",
// There's only one node (they start from 0), disable NUMA.

// set the number of file descriptors to max. print out error
perror("os::init_2 getrlimit failed");

perror("os::init_2 setrlimit failed");
// Initialize lock used to serialize thread creation (see os::create_thread)

tty->print_cr("There was an error trying to initialize the HPI library.");

// at-exit methods are called in the reverse order of their registration.
// atexit functions are called on return from main or as a result of a
// call to exit(3C). There can be only 32 of these functions registered
// and atexit() does not set errno.

// only register atexit functions if PerfAllowAtExitRegistration is set.
// atexit functions can be delayed until process exit time, which
// can be problematic for embedded VM situations. Embedded VMs should
// call DestroyJavaVM() to assure that VM resources are released.

// note: perfMemory_exit_helper atexit function may be removed in
// the future if the appropriate cleanup code can be added to the
// VM_Exit VMOperation's doit method.
warning("os::init2 atexit(perfMemory_exit_helper) failed");

// initialize thread priority policy

// Mark the polling page as unreadable
fatal("Could not disable polling page");

// Mark the polling page as readable
fatal("Could not enable polling page");
// Linux doesn't yet have an (official) notion of processor sets,
// so just return the number of online processors.

// Suspends the target using the signal mechanism and then grabs the PC before
// resuming the target. Used by the flat-profiler only

// Make sure that it is called by the watcher for the VMThread

// NULL context is unexpected, double-check this is the VMThread

// failure means pthread_kill failed for some reason - arguably this is
// a fatal problem, but such problems are ignored elsewhere

// 6292965: LinuxThreads pthread_cond_timedwait() resets FPU control
// word back to default 64bit precision if condvar is signaled. Java
// wants 53bit precision. Save and restore current value.

////////////////////////////////////////////////////////////////////////////////

// decode some bytes around the PC

////////////////////////////////////////////////////////////////////////////////

// This does not do anything on Linux. This is basically a hook for being
// able to use structured exception handling (thread-local exception filters)

// Prevent process from exiting upon "read error" without consuming all CPU

return buf[0] == 'y' || buf[0] == 'Y';
// Is a (classpath) directory empty?

// create binary file, rewriting existing file if required

// return current position of file pointer

// move file pointer to the specified offset

// Map a block of memory.

// Remap a block of memory.
// same as map_memory() on this OS

// Unmap a block of memory.

assert(rc == 0, "pthread_getcpuclockid is expected to return 0 code");
// current_thread_cpu_time(bool) and thread_cpu_time(Thread*, bool)
// are used by JVM M&M and JVMTI to get user+sys or user CPU time
// current_thread_cpu_time() and thread_cpu_time(Thread*) return
// the fast estimate available on the platform.

// return user + sys since the cost is the same

// consistent with what current_thread_cpu_time() returns

// We first try accessing /proc/<pid>/cpu since this is faster to
// process. If this file is not present (linux kernels 2.5 and above)
// then we open /proc/<pid>/stat.

if (count != 3) return -1;
// The /proc/<tid>/stat aggregates per-process usage on
// new Linux kernels 2.6+ where NPTL is supported.
// The /proc/self/task/<tid>/stat still has the per-thread usage.
// There can be no directory /proc/self/task on kernels 2.4 with NPTL
// and possibly in some other cases, so we check its availability.

// This is executed only once

// Skip pid and the command string. Note that we could be dealing with
// weird command names, e.g. user could decide to rename java launcher
// to "java 1.4.2 :)", then the stat file would look like
//                1234 (java 1.4.2 :)) R ... ...
// We don't really need to know the command string, just find the last
// occurrence of ")" and then start parsing from there. See bug 4726580.
if (s == NULL) return -1;

count = sscanf(s, "%*c %d %d %d %d %d %lu %lu %lu %lu %lu %lu %lu",

if (count != 12) return -1;
// System loadavg support. Returns -1 if load average cannot be obtained.
// Linux doesn't yet have an (official) notion of processor sets,
// so just return the system wide load average.

  "Could not open pause file '%s', continuing immediately.\n", filename);
/*
 * NOTE: the following code is to keep the green threads code
 * in the libjava.so happy. Once the green threads code is removed,
 * this code will no longer be needed.
 */

// Beware -- Some versions of NPTL embody a flaw where pthread_cond_timedwait() can
// hang indefinitely. For instance NPTL 0.60 on 2.4.21-4ELsmp is vulnerable.
// For specifics regarding the bug see GLIBC BUGID 261237 :
// Briefly, pthread_cond_timedwait() calls with an expiry time that's not in the future
// will either hang or corrupt the condvar, resulting in subsequent hangs if the condvar
// is used. (The simple C test-case provided in the GLIBC bug report manifests the
// hang). The JVM is vulnerable via sleep(), Object.wait(timo), LockSupport.parkNanos()
// and monitorenter when we're using 1-0 locking. All those operations may result in
// calls to pthread_cond_timedwait(). Using LD_ASSUME_KERNEL to use an older version
// of libpthread avoids the problem, but isn't practical.
//
// Possible remedies:
//
// 1. Establish a minimum relative wait time. 50 to 100 msecs seems to work.
//    This is palliative and probabilistic, however. If the thread is preempted
//    between the call to compute_abstime() and pthread_cond_timedwait(), more
//    than the minimum period may have passed, and the abstime may be stale (in the
//    past) resulting in a hang. Using this technique reduces the odds of a hang
//    but the JVM is still vulnerable, particularly on heavily loaded systems.
//
// 2. Modify park-unpark to use per-thread (per ParkEvent) pipe-pairs instead
//    of the usual flag-condvar-mutex idiom. The write side of the pipe is set
//    NDELAY. unpark() reduces to write(), park() reduces to read() and park(timo)
//    reduces to poll()+read(). This works well, but consumes 2 FDs per extant
//    ParkEvent.
//
// 3. Embargo pthread_cond_timedwait() and implement a native "chron" thread
//    that manages timeouts. We'd emulate pthread_cond_timedwait() by enqueuing
//    a timeout request to the chron thread and then blocking via pthread_cond_wait().
//    This also works well. In fact it avoids kernel-level scalability impediments
//    on certain platforms that don't handle lots of active pthread_cond_timedwait()
//    timers in a graceful fashion.
//
// 4. When the abstime value is in the past it appears that control returns
//    correctly from pthread_cond_timedwait(), but the condvar is left corrupt.
//    Subsequent timedwait/wait calls may hang indefinitely. Given that, we
//    can avoid the problem by reinitializing the condvar -- by cond_destroy()
//    followed by cond_init() -- after all calls to pthread_cond_timedwait().
//    It may be possible to avoid reinitialization by checking the return
//    value from pthread_cond_timedwait(). In addition to reinitializing the
//    condvar we must establish the invariant that cond_signal() is only called
//    within critical sections protected by the adjunct mutex. This prevents
//    cond_signal() from "seeing" a condvar that's in the midst of being
//    reinitialized or that is corrupt. Sadly, this invariant obviates the
//    desirable signal-after-unlock optimization that avoids futile context switching.
//    I'm also concerned that some versions of NPTL might allocate an auxiliary
//    structure when a condvar is used or initialized. cond_destroy() would
//    release the helper structure. Our reinitialize-after-timedwait fix
//    put excessive stress on malloc/free and locks protecting the c-heap.
//
// We currently use (4). See the WorkAroundNPTLTimedWaitHang flag.
// It may be possible to refine (4) by checking the kernel and NPTL versions
// and only enabling the work-around for vulnerable environments.

// utility to compute the abstime argument to timedwait:
// millis is the relative timeout time
// abstime will be the absolute timeout time
// TODO: replace compute_abstime() with unpackTime()

if (seconds > 50000000) {
// see man cond_timedwait(3T)

// Test-and-clear _Event, always leaves _Event set to 0, returns immediately.
// Conceptually TryPark() should be equivalent to park(0).

guarantee((v == 0) || (v == 1), "invariant");

// TODO: assert that _Assoc != NULL or _Assoc == Self

// Do this the hard way by blocking ...

// for some reason, under 2.7 lwp_cond_wait() may return ETIME ...
// Treat this the same as if the wait was interrupted

// In theory we could move the ST of 0 into _Event past the unlock(),
// but then we'd need a MEMBAR after the ST.

if (v != 0) return OS_OK;
// We do this the hard way, by blocking the thread.
// Consider enforcing a minimum timeout value.

// Object.wait(timo) will return because of
// Thread.interrupt and object.notify{All} both call Event::set.
// That is, we treat thread.interrupt as a special case of notification.
// The underlying Solaris implementation, cond_timedwait, admits
// spurious wakeups, but the specification prevents the
// JVM from making those visible to Java code. As such, we must
// filter out spurious wakeups. We assume all ETIME returns are valid.

// TODO: properly differentiate simultaneous notify+interrupt.
// In that case, we should propagate the notify to another waiter.

// We consume and ignore EINTR and spurious wakeups.

// The LD of _Event could have reordered or be satisfied
// by a read-aside from this processor's write buffer.
// To avoid problems execute a barrier and then
// re-read the value.

// Wait for the thread associated with the event to vacate

// Note that we signal() _after dropping the lock for "immortal" Events.
// This is safe and avoids a common class of futile wakeups. In rare
// circumstances this can cause a thread to return prematurely from
// cond_{timed}wait() but the spurious wakeup is benign and the victim will
// simply re-test the condition and re-park itself.

// -------------------------------------------------------

/*
 * The solaris and linux implementations of park/unpark are fairly
 * conservative for now, but can be improved. They currently use a
 * mutex/condvar pair, plus a counter.
 * Park decrements count if > 0, else does a condvar wait. Unpark
 * sets count to 1 and signals condvar. Only one thread ever waits
 * on the condvar. Contention seen when trying to park implies that someone
 * is unparking you, so don't wait. And spurious returns are fine, so there
 * is no need to track notifications.
 *
 * This code is common to linux and solaris and will be moved to a
 * common place in dolphin.
 *
 * The passed in time value is either a relative time in nanoseconds
 * or an absolute time in milliseconds. Either way it has to be unpacked
 * into suitable seconds and nanoseconds components and stored in the
 * given timespec structure.
 * Given time is a 64-bit value and the time_t used in the timespec is only
 * a signed-32-bit value (except on 64-bit Linux) we have to watch for
 * overflow if times way in the future are given. Further on Solaris versions
 * prior to 10 there is a restriction (see cond_timedwait) that the specified
 * number of seconds, in abstime, is less than current_time + 100,000,000.
 * As it will be 28 years before "now + 100000000" will overflow we can
 * ignore overflow and just impose a hard-limit on seconds using the value
 * of "now + 100,000,000". This places a limit on the timeout of about 3.17
 * years from "now".
 */

// Optional fast-path check:
// Return immediately if a permit is available.

// Optional optimization -- avoid state transitions if there's an interrupt pending.
// Check interrupt before trying to wait

if (time < 0) {   // don't wait at all

// Enter safepoint region
// Beware of deadlocks such as 6317397.
// The per-thread Parker:: mutex is a classic leaf-lock.
// In particular a thread must never block on the Threads_lock while
// holding the Parker:: mutex. If safepoints are pending both the
// ThreadBlockInVM() CTOR and DTOR may grab Threads_lock.

// Don't wait if cannot get lock since interference arises from
// unblocking. Also check interrupt before trying wait

// Don't catch signals while blocked; let the running threads have the signals.
// (This allows a debugger to break into the running thread.)

// cleared by handle_special_suspend_equivalent_condition() or java_suspend_self()

// If externally suspended while waiting, re-suspend

// Run the specified command in a separate process. Return its exit value,
// or -1 on failure (e.g. can't fork a new process).
// Unlike system(), this function can be called from signal handler. It
// doesn't block SIGINT et al.

// It needs to run
// pthread_atfork handlers and reset pthread library. All we need is a
// separate process to execve. Make a direct syscall to fork process.

// On IA64 there's no fork syscall, we have to use fork() and hope for
// the best.

// execve() in LinuxThreads will call pthread_kill_other_threads_np()
// first to kill every thread on the thread list. Because this list is
// not reset by fork() (see notes above), execve() will instead kill
// every thread in the parent process. We know this is the only thread
// in the new process, so make a system call directly.

// IA64 should use normal execve() from glibc to match the glibc fork()

// We don't really care about the actual exit code, for now.

// Wait for the child process to exit. This returns immediately if
// the child has already exited.

// The child exited normally; get its exit code.
// The child exited because of a signal.
// The best value to return is 0x80 + signal number,
// because that is what all Unix shells do, and because
// it allows callers to distinguish between process exit and
// process death by signal.

// Unknown exit code; pass it through