macro.cpp revision 0
/*
 * Copyright 2005-2007 Sun Microsystems, Inc.  All Rights Reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
 * CA 95054 USA or visit www.sun.com if you need additional information or
 * have any questions.
 */

#include "incls/_precompiled.incl"

// Replace any references to "oldref" in inputs to "use" with "newref".
// Returns the number of replacements made.

  // Copy debug information and adjust JVMState information

  // Fast path taken; set region slot 2

  // Fast path not-taken, i.e. slow path

//--------------------copy_predefined_input_for_runtime_call--------------------
  // Set fixed predefined input arguments

//------------------------------make_slow_call---------------------------------
  // Slow path call has no side-effects, uses few values

  // For Control (fallthrough) and I_O (catch_all_index) we have CatchProj -> Catch -> Proj
  assert(false, "unexpected projection from allocation node.");
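
// Illustration only: the "replace references to oldref with newref" helper
// documented near the top of this file can be sketched on a toy graph as
// follows. ToyNode and toy_replace_input are hypothetical names invented for
// this sketch; they are not the compiler's Node class or the real helper,
// which may treat required and precedence edges differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an IR node: it only records its input edges.
struct ToyNode {
  std::vector<ToyNode*> inputs;
};

// Rewrites every input edge of "use" that points at "oldref" so that it
// points at "newref" instead, and returns how many edges were rewritten.
inline int toy_replace_input(ToyNode& use, const ToyNode* oldref, ToyNode* newref) {
  assert(oldref != nullptr);  // assumption of this sketch: a real node is being replaced
  int replacements = 0;
  for (ToyNode*& in : use.inputs) {
    if (in == oldref) {
      in = newref;
      ++replacements;
    }
  }
  return replacements;
}
```

// A caller can use the returned count to decide whether "use" changed at all
// and therefore needs to be revisited.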
//---------------------------set_eden_pointers-------------------------
  if (UseTLAB) {
    // Private allocation: load from TLS
  } else {
    // Shared allocation: load from globals

//=============================================================================
//
//                              A L L O C A T I O N
//
// Allocation attempts to be fast in the case of frequent small objects.
// It breaks down like this:
//
// 1) Size in doublewords is computed.  This is a constant for objects and
//    variable for most arrays.  Doubleword units are used to avoid size
//    overflow of huge doubleword arrays.  We need doublewords in the end for
//
// 2) Size is checked for being 'too large'.  Too-large allocations will go
//    the slow path into the VM.  The slow path can throw any required
//    exceptions, and does all the special checks for very large arrays.  The
//    size test can constant-fold away for objects.  For objects with
//    finalizers it constant-folds the other way: you always go slow with
//
// 3) If NOT using TLABs, this is the contended loop-back point.
//    Load-Locked the heap top.  If using TLABs normal-load the heap top.
//
// 4) Check that heap top + size*8 < max.  If we fail go the slow route.
//    NOTE: "top+size*8" cannot wrap the 4Gig line!  Here's why: for largish
//    "size*8" we always enter the VM, where "largish" is a constant picked small
//    enough that there's always space between the eden max and 4Gig (old space is
//    there so it's quite large) and large enough that the cost of entering the VM
//    is dwarfed by the cost to initialize the space.
//
// 5) If NOT using TLABs, Store-Conditional the adjusted heap top back
//    down.  If contended, repeat at step 3.  If using TLABs normal-store
//    adjusted heap top back down; there is no contention.
//
// 6) If !ZeroTLAB then Bulk-clear the object/array.  Fill in klass & mark
//
// 7) Merge with the slow-path; cast the raw memory pointer to the correct
//
//=============================================================================
// FastAllocateSizeLimit value is in DOUBLEWORDS.
// Allocations bigger than this always go the slow route.
// This value must be small enough that allocation attempts that need to
// trigger exceptions go the slow route.  Also, it must be small enough so
// that heap_top + size_in_bytes does not wrap around the 4Gig limit.
//=============================================================================
//
// The allocator will coalesce int->oop copies away.  See comment in
// code shape produced here, so if you are changing this code shape
// make sure the GC info for the heap-top is correct in and around the

  // Load Eden::end.  Loop invariant and hoisted.
  //
  // Note: We set the control input on "eden_end" and "old_eden_top" when using
  //       a TLAB to work around a bug where these values were being moved across
  //       a safepoint.  These are not oops, so they cannot be included in the oop
  //       map, but they can be changed by a GC.  The proper way to fix this would
  //       be to set the raw memory state when generating a SafepointNode.  However
  //       this will require extensive changes to the loop optimization in order to
  //       prevent a degradation of the optimization.

  // We need a Region and corresponding Phi's to merge the slow-path and fast-path results.
  // they will not be used if "always_slow" is set

  // The initial slow comparison is a size check, the comparison
  // we want to do is a BoolTest::gt

    // Force slow-path allocation

  // generate the initial test if necessary

    // Now make the initial failure test.  Usually a too-big test but
    // might be a TRUE for finalizers or a fancy class check for

    // Plug the failing-too-big test into the slow-path region
  } else {
    // No initial test, just fall into next case

  // generate the fast allocation code unless we know that the initial test will always go slow

    // allocate the Region and Phi nodes for the result

    // We need a Region for the loop-back contended case.

    // Now handle the passing-too-big test.  We fall into the contended
    // loop-back merge point.

    // Load(-locked) the heap top.
    // See note above concerning the control input when using a TLAB

    // Add to heap top to get a new heap top

    // Check for needing a GC; compare against heap end

    // Plug the failing-heap-space-need-gc test into the slow-path region

      // This completes all paths into the slow merge point
    } else {
      // No initial slow path needed!
      // Just fall from the need-GC path straight into the VM call.

    // No need for a GC.  Setup for the Store-Conditional

    // Grab regular I/O before optional prefetch may change it.
    // Slow-path does no I/O so just set it to the original I/O.

    // Store (-conditional) the modified eden top back down.
    // StorePConditional produces flags for a test PLUS a modified raw

    // If not using TLABs, check to see if there was contention.

    // If contention, loopback and try again.

    // Fast-path succeeded with no contention!

    // Rename successful fast-path variables to make meaning more obvious

                                   "dtrace_object_alloc",
      // Get base of thread-local storage area

    // Plug in the successful fast-path into the result merge point

  // Generate slow-path call

  // Copy debug information and adjust JVMState information, then replace
  // allocate node with the call

  // Identify the output projections from the allocate node and
  // adjust any references to them.
  // The control and io projections look like:
  //
  //        v---Proj(ctrl) <-----+   v---CatchProj(ctrl)
  //        ^---Proj(io) <-------+   ^---CatchProj(io)
  //
  //  We are interested in the CatchProj nodes.

  // An allocate node has separate memory projections for the uses on the control and i_o paths
  // Replace uses of the control memory projection with result_phi_rawmem (unless we are only generating a slow call)

  // Now change uses of _memproj_catchall to use _memproj_fallthrough and delete _memproj_catchall so
  // we end up with a call that has only 1 memory projection

  // An allocate node has separate i_o projections for the uses on the control and i_o paths
  // Replace uses of the control i_o projection with result_phi_i_o (unless we are only generating a slow call)

  // Now change uses of _ioproj_catchall to use _ioproj_fallthrough and delete _ioproj_catchall so
  // we end up with a call that has only 1 control projection

  // if we generated only a slow call, we are done

    // no uses of the allocation result

  // Plug slow-path into result merge point

  // This completes all paths into the result merge point

// Helper for PhaseMacroExpand::expand_allocate_common.
// Initializes the newly-allocated storage.

  // Store the klass & mark bits

  // For now only enable fast locking for non-array types

    // conservatively small header size:

  // Clear the object body, if necessary.

    // The init has somehow disappeared; be cautious and clear everything.
    //
    // This can happen if a node is allocated but an uncommon trap occurs
    // immediately.  In this case, the Initialize gets associated with the
    // trap, and may be placed in a different (outer) loop, if the Allocate
    // is in a loop.  If (this is rare) the inner loop gets unrolled, then
    // there can be two Allocates to one Initialize.  The answer in all these
    // edge cases is safety first.  It is always safe to clear immediately
    // within an Allocate, and then (maybe or maybe not) clear some more later.

    // Try to win by zeroing only what the init does not store.
    // We can also try to do some peephole optimizations,
    // such as combining some adjacent subword stores.

    // We have no more use for this link, since the AllocateNode goes away:

    // (If we keep the link, it just confuses the register allocator,
    // who thinks he sees a real use of the address by the membar.)

// Generate prefetch instructions for next allocations.

      // Generate prefetch allocation with watermark check.
      // As an allocation hits the watermark, we will prefetch starting
      // at a "distance" away from watermark.

      // I/O is used for Prefetch

      // check against new_eden_top

      // true node, add prefetchdistance

      // adding prefetches

      // Insert a prefetch for each allocation only on the fast-path

      // Generate several prefetch instructions only for arrays.

      // Do not let it float too high, since if eden_top == eden_end,
      // both might be null.
      if( i == 0 ) {
        // Set control for first prefetch, next follows it

// If we have determined that this lock/unlock can be eliminated, we simply
// eliminate the node without expanding it.
//
// Note:  The membar's associated with the lock/unlock are currently not
//        eliminated.  This should be investigated as a future enhancement.

  // The input to a Lock is merged memory, so extract its RawMem input
  // (unless the MergeMem has been optimized away.)

  // There are 2 projections from the lock.  The lock node will
  // be deleted when its last use is subsumed below.

//------------------------------expand_lock_node----------------------

  // Make the merge point

  // Optimize test; set region slot 2

  // Make slow path call

  // Slow path can only throw asynchronous exceptions, which are always
  // de-opted.  So the compiler thinks the slow-call can never throw an
  // exception.  If it DOES throw an exception we would need the debug
  // info removed first (since if it throws there is no monitor).

  // Capture slow path
  // disconnect fall-through projection from call and create a new one
  // hook up users of fall-through projection to region

  // region inputs are now complete

  // create a Phi for the memory state

//------------------------------expand_unlock_node----------------------

  // No need for a null check on unlock

  // Make the merge point

  // Optimize test; set region slot 2

  // No exceptions for unlocking
  // Capture slow path
  // disconnect fall-through projection from call and create a new one
  // hook up users of fall-through projection to region

  // region inputs are now complete

  // create a Phi for the memory state

//------------------------------expand_macro_nodes----------------------
//  Returns true if a failure occurred.

  // Make sure expansion will not cause node limit to be exceeded.  Worst case is a
  // macro node gets expanded into about 50 nodes.  Allow 50% more for optimization

  // expand "macro" nodes
  // nodes are removed from the macro list as they are processed

      // node is unreachable, so don't try to expand it

      assert(false, "unknown node type in macro list");
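
// Illustration only: the numbered fast-path steps in the A L L O C A T I O N
// comment earlier in this file (size check, load heap top, bounds check,
// store-conditional with retry, bulk clear) can be sketched outside the
// compiler as a bump-pointer allocator. Everything named Toy*/kToy* below is
// hypothetical and invented for this sketch. A compare-and-swap stands in for
// the Load-Locked/Store-Conditional pair, and a null return stands in for
// "take the slow path into the VM". With a TLAB the same steps would use a
// plain load and store, since there is no contention.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

constexpr std::size_t kToyEdenBytes = 4096;            // hypothetical eden size
constexpr std::size_t kToyFastSizeLimitDwords = 1024;  // hypothetical limit, in doublewords

// Hypothetical shared eden: bytes [top, kToyEdenBytes) are still free.
struct ToyEden {
  alignas(8) unsigned char storage[kToyEdenBytes];
  std::atomic<std::size_t> top{0};  // offset of the first free byte
};

// Returns zeroed storage for size_dwords doublewords, or nullptr when the
// request has to go the slow route.
inline void* toy_fast_allocate(ToyEden& eden, std::size_t size_dwords) {
  assert(size_dwords > 0);
  // Step 2: too-large requests always go slow. Because the size is bounded
  // here, top + size*8 below cannot wrap around.
  if (size_dwords > kToyFastSizeLimitDwords) return nullptr;
  const std::size_t size_bytes = size_dwords * 8;

  // Step 3: load the heap top; this is the contended loop-back point.
  std::size_t old_top = eden.top.load();
  for (;;) {
    // Step 4: check that heap top + size*8 < max, else go slow.
    const std::size_t new_top = old_top + size_bytes;
    if (new_top >= kToyEdenBytes) return nullptr;
    // Step 5: try to publish the adjusted top. On failure old_top is
    // refreshed with the current value and the loop repeats from step 4.
    if (eden.top.compare_exchange_weak(old_top, new_top)) break;
  }

  // Step 6: bulk-clear the new object body.
  unsigned char* result = eden.storage + old_top;
  std::memset(result, 0, size_bytes);
  return result;
}
```

// The strict "<" in step 4 follows the comment as written, so a request that
// would exactly fill the remaining space also takes the slow route here.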