safepoint.cpp revision 0
/*
 * Copyright 1997-2007 Sun Microsystems, Inc.  All Rights Reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
 * CA 95054 USA or visit www.sun.com if you need additional information or
 * have any questions.
 */

#include "incls/_precompiled.incl"

// --------------------------------------------------------------------------------------------------

static volatile int PageArmed = 0 ;        // safepoint polling page is RO|RW vs PROT_NONE
static volatile int TryingToBlock = 0 ;
// proximate value -- for advisory use only

// Roll all threads forward to a safepoint and suspend them all
  // In the future we should investigate whether CMS can use the
  // more-general mechanism below.  DLD (01/05).
  // By getting the Threads_lock, we assure that no threads are about to start or
  // exit. It is released again in SafepointSynchronize::end().
  // Set number of threads to wait for, before we initiate the callbacks
  // Save the starting time, so that it can be compared to see if this has taken
  // too long to complete.
  // Begin the process of bringing the system to a safepoint.
  // Java threads can be in several different states and are
  // stopped by different mechanisms:
  //
  //  1. Running interpreted
  //     The interpreter dispatch table is changed to force it to
  //     check for a safepoint condition between bytecodes.
  //  2. Running in native code
  //     When returning from the native code, a Java thread must check
  //     the safepoint _state to see if we must block.  If the
  //     VM thread sees a Java thread in native, it does
  //     not wait for this thread to block.  The order of the memory
  //     writes and reads of both the safepoint state and the Java
  //     thread's state is critical.  In order to guarantee that the
  //     memory writes are serialized with respect to each other,
  //     the VM thread issues a memory barrier instruction
  //     (on MP systems).  In order to avoid the overhead of issuing
  //     a memory barrier for each Java thread making native calls, each Java
  //     thread performs a write to a single memory page after changing
  //     the thread state.  The VM thread performs a sequence of
  //     mprotect OS calls which forces all previous writes from all
  //     Java threads to be serialized.  This is done in the
  //     os::serialize_thread_states() call.  This has proven to be
  //     much more efficient than executing a membar instruction
  //     on every call to native code.
  //  3. Running compiled Code
  //     Compiled code reads a global (Safepoint Polling) page that
  //     is set to fault if we are trying to get to a safepoint.
  //  4. Blocked
  //     A thread which is blocked will not be allowed to return from the
  //     block condition until the safepoint operation is complete.
  //  5. In VM or Transitioning between states
  //     If a Java thread is currently running in the VM or transitioning
  //     between states, the safepointing code will wait for the thread to
  //     block itself when it attempts to transition to a new state.
  //
  // Flush all thread states to memory
  // Make interpreter safepoint aware
  // Make polling safepoint aware
  // Consider using active_processor_count() ... but that call is expensive.
  // Iterate through all threads until it has been determined how to stop them all at a safepoint
  // consider adjusting steps downward:
  //   steps = MIN(steps, 2000-100)
  //   if (iterations != 0) steps -= NNN
  // Check whether it is taking too long
  // Spin to avoid context switching.
  // There's a tension between allowing the mutators to run (and rendezvous)
  // vs spinning.  As the VM thread spins, wasting cycles, it consumes CPU that
  // a mutator might otherwise use profitably to reach a safepoint.  Excessive
  // spinning by the VM thread on a saturated system can increase rendezvous latency.
  // Blocking or yielding incur their own penalties in the form of context switching
  // and the resultant loss of cache residency.
  // Further complicating matters is that yield() does not work as naively expected
  // on many platforms -- yield() does not guarantee that any other ready threads
  // will run.  As such we revert to yield_all() after some number of iterations.
  // Yield_all() is implemented as a short unconditional sleep on some platforms.
  // Typical operating systems round a "short" sleep period up to 10 msecs, so sleeping
  // can actually increase the time it takes the VM thread to detect that a system-wide
  // stop-the-world safepoint has been reached.  In a pathological scenario such as that
  // described in CR6415670 the VMthread may sleep just before the mutator(s) become safe.
  // In that case the mutators will be stalled waiting for the safepoint to complete and
  // the VMthread will be sleeping, waiting for the mutators to rendezvous.  The VMthread
  // will eventually wake up and detect that all mutators are safe, at which point
  // we'll again make progress.
  // Beware too that the VMThread typically runs at elevated priority.
  // Its default priority is higher than the default mutator priority.
  // Obviously, this complicates spinning.
  // Note too that on Windows XP SwitchToThread() has quite different behavior than Sleep(0).
  // Sleep(0) will _not yield to lower priority threads, while SwitchToThread() will.
  // In the future we might:
  // 1. Modify the safepoint scheme to avoid potentially unbounded spinning.
  //    This is tricky as the path used by a thread exiting the JVM (say on
  //    a JNI call-out) simply stores into its state field.  The burden
  //    is placed on the VM thread, which must poll (spin).
  // 2. Find something useful to do while spinning.  If the safepoint is GC-related
  //    we might aggressively scan the stacks of threads that are already safe.
  // 3. Use Solaris schedctl to examine the state of the still-running mutators.
  //    If all the mutators are ONPROC there's no reason to sleep or yield.
  // 4. YieldTo() any still-running mutators that are ready but OFFPROC.
  // 5. Check system saturation.  If the system is not fully saturated then
  // 6. As still-running mutators rendezvous they could unpark the sleeping
  //    VMthread.  This works well for still-running mutators that become
  //    safe.  The VMthread must still poll for mutators that call-out.
  // 7. Drive the policy on time-since-begin instead of iterations.
  // 8. Consider making the spin duration a function of the # of CPUs:
  //    Spin = (((ncpus-1) * M) + K) + F(still_running)
  //    Alternately, instead of counting iterations of the outer loop
  //    we could count the # of threads visited in the inner loop, above.
  // 9. On Windows consider using the return value from SwitchToThread()
  // Instead of (ncpus > 1) consider either (still_running < (ncpus + EPSILON)) or
  // ((still_running + _waiting_to_block - TryingToBlock)) < ncpus)
  // Alternately, the VM thread could transiently depress its scheduling priority or
  // transiently increase the priority of the tardy mutator(s).
  // wait until all threads are stopped
  // Compute remaining time
  // If there is no remaining time, then there is an error
  // Call stuff that needs to be run when a safepoint is just about to be completed

// Wake up all threads, so they are ready to resume execution after the safepoint
// operation has been carried out
  // memory fence isn't required here since an odd _safepoint_counter
  // value can do no harm and a fence is issued below anyway.
  // A pending_exception cannot be installed during a safepoint.  The threads
  // may install an async exception after they come back from a safepoint into
  // pending_exception after they unblock.  But that should happen later.
  "safepoint installed a pending exception");
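// Illustration only, not part of the VM source: a minimal sketch of the escalation
// policy weighed in the SafepointSynchronize::begin() notes above (spin while another
// CPU can make progress, then yield, then fall back to a short sleep), together with
// the spin-duration formula from suggestion 8.  The names WaitAction,
// choose_wait_action and spin_budget, the two thresholds, and the choice F(n) = n are
// all assumptions made for this sketch; the VM's actual thresholds and pause/yield
// primitives do not appear in this listing.

```cpp
#include <cassert>

// Hypothetical actions the waiting (VM) thread can take on one polling iteration.
enum class WaitAction { Spin, Yield, Sleep };

// Pick an action from the iteration count.
// spin_limit and yield_limit are assumed tunables, not real VM flags.
inline WaitAction choose_wait_action(int steps, int ncpus,
                                     int spin_limit, int yield_limit) {
  assert(steps >= 0 && ncpus >= 1 && spin_limit <= yield_limit);
  // Spinning only helps when another CPU can run the mutators meanwhile.
  if (ncpus > 1 && steps < spin_limit) return WaitAction::Spin;
  // Yielding is the next cheapest option, but it may not run anyone else.
  if (steps < yield_limit) return WaitAction::Yield;
  // After enough iterations, back off to a short sleep.
  return WaitAction::Sleep;
}

// Suggestion 8: Spin = (((ncpus-1) * M) + K) + F(still_running).
// M and K are placeholder constants; F is taken to be the identity here.
inline int spin_budget(int ncpus, int still_running, int M, int K) {
  assert(ncpus >= 1 && still_running >= 0);
  return ((ncpus - 1) * M) + K + still_running;
}
```

// On a uniprocessor the sketch never spins: choose_wait_action(0, 1, 10, 100)
// goes straight to WaitAction::Yield, since the spinning thread would only be
// taking the CPU away from the mutators it is waiting for.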
  // Make polling safepoint aware
  // Remove safepoint check from interpreter
  // Set to not synchronized, so the threads will not go into the signal_thread_blocked method
  // when they get restarted.
  // Start suspended threads
  // A problem occurring on Solaris is when attempting to restart threads
  // the first #cpus - 1 go well, but then the VMThread is preempted when we get
  // to the next one (since it has been running the longest).  We then have
  // to wait for a cpu to become available before we can continue restarting
  // FIXME: This causes the performance of the VM to degrade when active and with
  // large numbers of threads.  Apparently this is due to the synchronous nature
  // of suspending threads.
  // TODO-FIXME: the comments above are vestigial and no longer apply.
  // Furthermore, using solaris' schedctl in this particular context confers no benefit
  // Release threads lock, so threads can be created/destroyed again. It will also start all threads
  // blocked in signal_thread_blocked
  // If there are any concurrent GC threads resume them.
  // Need a safepoint if some inline cache buffer is non-empty
  // This operation is going to be performed only at the end of a safepoint
  // and hence GC's will not be going on, all Java mutators are suspended
  // at this point and hence SystemDictionary_lock is also not needed.

// Various cleaning tasks that should be done periodically at safepoints
  // Update fat-monitor pool, since this is a safepoint.
  // native threads are safe if they have no java stack or have walkable stack
  // blocked threads should already have a walkable stack

// -------------------------------------------------------------------------------------------------------
// Implementation of Safepoint callback point
  // Threads shouldn't block if they are in the middle of printing, but...
  // Only bail from the block() call if the thread is gone from the
  // thread list; starting to exit should still block.
  // block current thread if we come here from native code when VM is gone
  // otherwise do nothing
  // Check that we have a valid thread_state at this point
  // We are highly likely to block on the Safepoint_lock. In order to avoid blocking in this case,
  // we pretend we are still in the VM.
  // We will always be holding the Safepoint_lock when we are examining the state
  // of a thread. Hence, the instructions between the Safepoint_lock->lock() and
  // Safepoint_lock->unlock() happen atomically with regard to the safepoint code
  // Decrement the number of threads to wait for and signal vm thread
  // Consider (_waiting_to_block < 2) to pipeline the wakeup of the VM thread
  // We transition the thread to state _thread_blocked here, but
  // we can't do our usual check for external suspension and then
  // self-suspend after the lock_without_safepoint_check() call
  // below because we are often called during transitions while
  // we hold different locks. That would leave us suspended while
  // holding a resource which results in deadlocks.
  // We now try to acquire the threads lock. Since this lock is held by the VM thread during
  // the entire safepoint, the threads will all line up here during the safepoint.
  // restore original state. This is important if the thread comes from compiled code, so it
  // will continue to execute with the _thread_in_Java state.
  "Should have called back to the VM before blocking.");
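// Illustration only, not part of the VM source: a single-threaded model of the
// bookkeeping described in the block() comments above.  A thread saves its state,
// pretends to be in the VM, checks in by decrementing the count the VM thread waits
// on, becomes blocked, and later gets its original state back.  Rendezvous, Mutator,
// enter_safepoint and leave_safepoint are hypothetical names; the real locking
// (Safepoint_lock, Threads_lock) is deliberately left out and only marked in comments.

```cpp
#include <cassert>

enum class ThreadState { InJava, InVM, Blocked };

// Hypothetical stand-in for the counter the VM thread waits on.
struct Rendezvous {
  int  waiting_to_block;     // threads the VM thread is still waiting for
  bool vm_thread_signalled;  // set once the last thread has checked in

  explicit Rendezvous(int nof_threads)
    : waiting_to_block(nof_threads), vm_thread_signalled(nof_threads == 0) {}
};

struct Mutator {
  ThreadState state;
  ThreadState saved_state;
};

// Model of a thread arriving at the safepoint.
inline void enter_safepoint(Mutator& t, Rendezvous& r) {
  assert(r.waiting_to_block > 0);
  t.saved_state = t.state;
  t.state = ThreadState::InVM;        // pretend to be in the VM while checking in
  // In the VM this decrement-and-test would run under Safepoint_lock.
  if (--r.waiting_to_block == 0) r.vm_thread_signalled = true;
  t.state = ThreadState::Blocked;     // would now queue up on Threads_lock
}

// Model of a thread resuming once the safepoint has ended.
inline void leave_safepoint(Mutator& t) {
  assert(t.state == ThreadState::Blocked);
  t.state = t.saved_state;            // e.g. back to InJava for compiled code
}
```

// The saved state matters for the case named in the comments: a thread that arrived
// from compiled code has to continue in its in-Java state afterwards.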
  // We transition the thread to state _thread_blocked here, but
  // we can't do our usual check for external suspension and then
  // self-suspend after the lock_without_safepoint_check() call
  // below because we are often called during transitions while
  // we hold different locks. That would leave us suspended while
  // holding a resource which results in deadlocks.
  // It is not safe to suspend a thread if we discover it is in _thread_in_native_trans. Hence,
  // the safepoint code might still be waiting for it to block. We need to change the state here,
  // so it can see that it is at a safepoint.
  // Block until the safepoint operation is completed.
  // Check for pending async exceptions or suspends - except if the
  // thread was blocked inside the VM. has_special_runtime_exit_condition()
  // is called last since it grabs a lock and we only want to do that when
  // Note: we never deliver an async exception at a polling point as the
  // compiler may not have an exception handler for it. The polling
  // code will notice the async and deoptimize and the exception will
  // be delivered. (Polling at a return point is ok though). Sure is
  // a lot of bother for a deprecated feature...
  // We don't deliver an async exception if the thread state is
  // _thread_in_native_trans so JNI functions won't be called with
  // a surprising pending exception. If the thread state is going back to java,
  // async exception is checked in check_special_condition_for_native_trans().

// ------------------------------------------------------------------------------------------------------
// Exception handlers
  tty->print_cr("--------+------address-----+------before-----------+-------after----------+");
  const int incr = 1;           // Increment to skip a long, in units of intptr_t
  tty->print_cr("--------+--address-+------before-----------+-------after----------+");
  const int incr = 2;           // Increment to skip a long, in units of intptr_t
  for( int i=0; i<16; i++ ) {
  // Sparc safepoint-blob frame structure.
  intptr_t* sp = thread->last_Java_sp();
  intptr_t stack_copy[150];
  for( int i=0; i<150; i++ ) stack_copy[i] = sp[i];
  for( int i=0; i<150; i++ )
    was_oops[i] = stack_copy[i] ? ((oop)stack_copy[i])->is_oop() : false;
  // print_me(sp,stack_copy,was_oops);
  // Print out the info for threads which didn't reach the safepoint, for debugging
  // purposes (useful when there are lots of threads in the debugger).
  tty->print_cr("# SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.");
  tty->print_cr("# SafepointSynchronize::begin: Timed out while waiting for threads to stop.");
  tty->print_cr("# SafepointSynchronize::begin: Threads which did not reach the safepoint:");
  // To debug the long safepoint, specify both DieOnSafepointTimeout &
  // ShowMessageBoxOnError.
  sprintf(msg, "Safepoint sync time longer than %d ms detected when executing %s.",

// -------------------------------------------------------------------------------------------------------
// Implementation of ThreadSafepointState
  // Check for a thread that is suspended. Note that thread resume tries
  // to grab the Threads_lock which we own here, so a thread cannot be
  // resumed during safepoint synchronization.
  // We check with locking because another thread that has not yet
  // synchronized may be trying to suspend this one.
  // Some JavaThread states have an initial safepoint state of
  // running, but are actually at a safepoint. We will happily
  // agree and update the safepoint state here.
  // All other thread states will continue to run until they
  // transition and self-block in state _blocked
  // Safepoint polling in compiled code causes the Java threads to do the same.
  // Note: new threads may require a malloc so they must be allowed to finish

// Returns true if thread could not be rolled forward at present position.
  " [0x%2x] State: %s _has_called_back %d _at_poll_safepoint %d",
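// Illustration only, not part of the VM source: a minimal sketch of how the comments
// in this file classify a thread when the VM thread examines it.  A suspended thread
// counts as stopped; a thread in native code is safe if it has no Java stack or its
// stack is walkable; a blocked thread is safe; a thread in the VM is expected to call
// back; everything else keeps running until it transitions and blocks itself.  The
// types ThreadSnapshot, JavaThreadState and SafepointType and the function examine()
// are hypothetical and only mirror the prose.

```cpp
#include <cassert>

enum class JavaThreadState { InNative, Blocked, InVM, InJava, Transitioning };
enum class SafepointType   { Running, AtSafepoint, CallBack };

// Hypothetical snapshot of the facts the examination relies on.
struct ThreadSnapshot {
  JavaThreadState state;
  bool suspended;        // externally suspended; cannot resume while Threads_lock is held
  bool has_java_stack;   // false for a native thread with no Java frames
  bool stack_walkable;
};

inline SafepointType examine(const ThreadSnapshot& t) {
  if (t.suspended) return SafepointType::AtSafepoint;
  switch (t.state) {
    case JavaThreadState::InNative:
      // Safe only if there is no Java stack or the stack can be walked.
      return (!t.has_java_stack || t.stack_walkable) ? SafepointType::AtSafepoint
                                                     : SafepointType::Running;
    case JavaThreadState::Blocked:
      return SafepointType::AtSafepoint;   // blocked threads already have a walkable stack
    case JavaThreadState::InVM:
      return SafepointType::CallBack;      // will block itself on its next transition
    default:
      return SafepointType::Running;       // must be stopped by polling or dispatch change
  }
}
```

// A result of Running means the thread could not be rolled forward at its present
// position and the VM thread has to look at it again on a later iteration.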

// ---------------------------------------------------------------------------------------------------------------------
// Block the thread at the safepoint poll or poll return.
  // Check state.  block() will set thread state to thread_in_vm which will
  // cause the safepoint state _type to become _call_back.
  "polling page exception on thread not running state");
  // Step 1: Find the nmethod from the return address
  // Find frame of caller
  // Should only be poll_return or poll
  // This is a poll immediately before a return. The exception handling code
  // has already had the effect of causing the return to occur, so the execution
  // will continue immediately after the call. In addition, the oopmap at the
  // return point does not mark the return value as an oop (if it is), so
  // it needs a handle here to be updated.
  // See if return type is an oop.
  // The oop result has been saved on the stack together with all
  // the other registers. In order to preserve it over GCs we need
  // to keep it in a handle.
  // restore oop result, if any
  // This is a safepoint poll. Verify the return address and block.
  // verify the blob built the "return address" correctly
  // If we have a pending async exception deoptimize the frame
  // as otherwise we may never deliver it.
  // If an exception has been installed we must check for a pending deoptimization
  // Deoptimize frame if exception has been thrown.
  // The exception patch will destroy registers that are still
  // live and will be needed during deoptimization. Defer the
  // Async exception should have deferred the exception until the
  // next safepoint which will be detected when we get into
  // the interpreter so if we have an exception now things
  fatal("Exception installed and deoptimization is pending");

// Statistics & Instrumentations
// last_safepoint_start_time records the start time of last safepoint.
  fatal("Wrong PrintSafepointStatisticsCount");
  // If PrintSafepointStatisticsTimeout is specified, the statistics data will
  // be printed right away, in which case, _safepoint_stats will regress to
  // a single element array. Otherwise, it is a circular ring buffer with default
  // size of PrintSafepointStatisticsCount.
  "not enough memory for safepoint instrumentation data");
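// Illustration only, not part of the VM source: a minimal sketch of the storage scheme
// described above, a ring buffer of samples that collapses to a single slot when every
// qualifying sample is printed right away.  StatsRing and SafepointSample are
// hypothetical names, and std::vector stands in for whatever allocation the VM
// actually performs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sample record; the real structure holds more fields.
struct SafepointSample { long sync_time_ms; };

class StatsRing {
 public:
  // print_immediately models the "timeout specified" case: one slot is enough.
  StatsRing(std::size_t count, bool print_immediately)
    : slots_(print_immediately ? 1 : count), next_(0), recorded_(0) {
    assert(count > 0);
  }

  // Store a sample, overwriting the oldest one once the buffer is full.
  void record(const SafepointSample& s) {
    slots_[next_] = s;
    next_ = (next_ + 1) % slots_.size();
    ++recorded_;
  }

  std::size_t capacity() const { return slots_.size(); }

  // Number of samples currently held (never more than capacity()).
  std::size_t size() const {
    return recorded_ < slots_.size() ? recorded_ : slots_.size();
  }

  // Most recently recorded sample.
  const SafepointSample& newest() const {
    assert(recorded_ > 0);
    return slots_[(next_ + slots_.size() - 1) % slots_.size()];
  }

 private:
  std::vector<SafepointSample> slots_;
  std::size_t next_;      // index of the slot the next record() will use
  std::size_t recorded_;  // total samples ever recorded
};
```

// With a capacity of 3, recording a fourth sample overwrites the first one, so the
// buffer always holds the most recent samples.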
  "[threads: total initially_running wait_to_block] ");
  "[vmop_time time_elapsed] ");
  // no page armed status printed out if it is always armed.
  // Records the start time of spinning. The real time spent on spinning
  // will be adjusted when spin is done. Same trick is applied for time
  // spent on waiting for threads to block.
  // Records the start time of waiting for threads to block. Updated when block is done.
  // Records the end time of sync which will be used to calculate the total
  // vm operation time. Again, the real time spent in syncing will be deducted
  // from the start of the sync time later when end_statistics is called.
  // Update the vm operation time.
  // Only the sync time longer than the specified
  // PrintSafepointStatisticsTimeout will be printed out right away.
  // By default, it is -1 meaning all samples will be put into the list.
  // The safepoint statistics will be printed out when the _safepoint_stats
  // "/ MICROUNITS " is to convert the unit from nanos to millis.

// This method will be called when VM exits. It will first call
// print_statistics to print out the rest of the sampling. Then
// it tries to summarize the sampling.
  // During VM exit, end_statistics may not get called and in that
  // case, if the sync time is less than PrintSafepointStatisticsTimeout,
  // don't print it out.
  // Approximate the vm op time.
  // Print out polling page sampling status.

// ------------------------------------------------------------------------------------------------
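// Illustration only, not part of the VM source: a minimal sketch of the timing trick
// described in the statistics comments above.  A field first holds a start timestamp
// and is later overwritten with the elapsed time; nanoseconds are then divided down
// to milliseconds.  SyncTiming, begin_spin, end_spin, nanos_to_millis and
// prints_immediately are hypothetical names, and kMicroUnits = 1000000 is an assumed
// value chosen so that the division matches the nanos-to-millis comment.

```cpp
#include <cassert>
#include <cstdint>

// Assumed conversion factor: nanoseconds per millisecond.
const std::int64_t kMicroUnits = 1000000;

struct SyncTiming {
  // Holds the start timestamp while spinning, the elapsed nanoseconds afterwards.
  std::int64_t time_to_spin;
};

// Record when spinning starts.
inline void begin_spin(SyncTiming& t, std::int64_t now_ns) {
  t.time_to_spin = now_ns;
}

// Replace the start timestamp with the time actually spent spinning.
inline void end_spin(SyncTiming& t, std::int64_t now_ns) {
  assert(now_ns >= t.time_to_spin);
  t.time_to_spin = now_ns - t.time_to_spin;
}

// Integer conversion from nanoseconds to whole milliseconds.
inline std::int64_t nanos_to_millis(std::int64_t ns) {
  return ns / kMicroUnits;
}

// Sketch of the timeout rule: a non-positive timeout (the default is -1) means
// nothing is printed right away and every sample goes into the list.
inline bool prints_immediately(std::int64_t sync_ms, std::int64_t timeout_ms) {
  return timeout_ms > 0 && sync_ms > timeout_ms;
}
```

// Reusing one field for both the timestamp and the duration avoids a second field per
// phase, at the cost of the field being meaningless as a duration until end_spin()
// has run.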