// memnode.cpp revision 3806
/*
 * Copyright (c) 1997, 2011, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 */

// Portions of code courtesy of Clifford Click

// Optimization - Graph Style

//=============================================================================

  // fake the missing field
  return mchain;
  // don't try to optimize non-instance types
  break;

  // hit one of our sentinels
  // skip over a call which does not affect this memory slice
  break;

  // hit one of our sentinels

  // Stop if this is the initialization for the object instance which
  // contains this memory slice, otherwise skip over it.
  // Cannot bypass initialization of the instance
  // Otherwise skip it (the call updated 'result' value).

  // clone the Phi with our address type

  // Check that current type is consistent with the alias index used during graph construction
  // Sometimes dead array references collapse to a[-1], a[-2], or a[-3];
  // don't assert if it is dead code.
  // TypeOopPtr::NOTNULL+any is an OOP with unknown offset - generally
  // means an array I have not precisely typed yet.  Do not do any
  // alias stuff with it any time soon.

  // compress paths and change unreachable cycles to TOP
  // If not, we can update the input infinitely along a MergeMem cycle
  // Equivalent code in PhiNode::Ideal

  // If transformed to a MergeMem, get the desired slice
  // Otherwise the returned node represents memory for every slice

  // Update input if it is progress over what we have now

//--------------------------Ideal_common---------------------------------------
// Look for degenerate control and memory inputs.  Bypass MergeMem inputs.
// Unhook non-raw memories from complete (macro-expanded) initializations.

  // If our control input is a dead region, kill all below the region

  // Don't bother trying to transform a dead node

  // Wait if control is on the worklist.
  // This control path may be dead.
  // Delay this memory node transformation until the control is processed.

  // Ignore if memory is dead, or self-loop
  // This memory slice may be dead.
  // Delay this mem node transformation until the memory is processed.

  // The address's base and type may change when the address is processed.
  // Delay this mem node transformation until the address is processed.
  // Do NOT remove or optimize the next lines: ensure a new alias index
  // is allocated for an oop pointer type before Escape Analysis.
  // Note: C++ will not remove it since the call has a side effect.

  // Avoid independent memory operations

  // The code which unhooks non-raw memories from complete (macro-expanded)
  // initializations was removed. After macro-expansion all stores caught
  // by an Initialize node became raw stores and there is no information
  // about which memory slices they modify. So it is unsafe to move any memory
  // operation above these stores. Also in most cases hooked non-raw memories
  // were already unhooked by using information from detect_ptr_independence()
  // and find_previous_store().

  // let the subclass continue analyzing...

// Helper function for proving some simple control dominations.
// Attempt to prove that all control inputs of 'dom' dominate 'sub'.
// Already assumes that 'dom' is available at 'sub', and that 'sub'
// is not a constant (dominated by the method's StartNode).
// Used by MemNode::find_previous_store to prove that the
// control input of a memory operation predates (dominates)
// an allocation it wants to look past.

  return false;   // Conservative answer for dead code

  // Check 'dom'. Skip Proj and CatchProj nodes.
  return false;   // Conservative answer for dead code

  // For the case when, for example, 'sub' is Initialize and the original
  // 'dom' is a Proj node of the 'sub'.

  // 'dom' dominates 'sub' if its control edge and the control edges
  // of all its inputs dominate or are equal to sub's control edge.

  // Currently 'sub' is either an Allocate, Initialize or Start node,
  // or a Region for the check in LoadNode::Ideal();
  // 'sub' should have sub->in(0) != NULL.

  // Get control edge of 'sub'.
  return false;   // Conservative answer for dead code

  // Check all control edges of 'dom'.
  return false;   // One of dom's inputs dominated by sub.

  // Check only own control edge for pinned non-control nodes.
  return false;   // Conservative answer for dead code

  // First, own control edge.
  return false;   // Conservative answer for dead code

  // Now, the rest of edges.

//---------------------detect_ptr_independence---------------------------------
// Used by MemNode::find_previous_store to prove that two base
// pointers are never equal.
// The pointers are accompanied by their associated allocations,
// if any, which have been previously discovered by the caller.

  // Attempt to prove that these two pointers cannot be aliased.
  // They may both manifestly be allocations, and they should differ.
  // Or, if they are not both allocations, they can be distinct constants.
  // Otherwise, one is an allocation and the other a pre-existing value.
  }
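// The three independence cases above can be sketched with a tiny stand-in
// model (BasePtr and ptrs_independent() below are hypothetical illustrations,
// not HotSpot types):

```cpp
#include <cassert>

// Hypothetical stand-in for a base pointer: it may be a fresh allocation
// (alloc_id >= 0) and/or a compile-time constant (con_id >= 0).
struct BasePtr {
  int alloc_id;   // id of the allocation that produced it, or -1
  int con_id;     // id of the constant, or -1
};

// Sketch: two distinct allocations never alias; two distinct constants never
// alias; an allocation vs. a pre-existing value would need a dominance
// proof, so this sketch answers conservatively.
bool ptrs_independent(const BasePtr& p1, const BasePtr& p2) {
  if (p1.alloc_id >= 0 && p2.alloc_id >= 0)
    return p1.alloc_id != p2.alloc_id;   // both manifestly allocations
  if (p1.con_id >= 0 && p2.con_id >= 0)
    return p1.con_id != p2.con_id;       // both distinct constants
  return false;                          // conservative answer
}
```

// A conservative 'false' is always safe here: it only forfeits an
// optimization, never correctness.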
  else if (a1 != NULL) {        // one allocation a1
    // (Note:  p2->is_Con implies p2->in(0)->is_Root, which dominates.)
  }
  else { //(a2 != NULL)         // one allocation a2

// The logic for reordering loads and stores uses four steps:
// (a) Walk carefully past stores and initializations which we
//     can prove are independent of this load.
// (b) Observe that the next memory state makes an exact match
//     with self (load or store), and locate the relevant store.
// (c) Ensure that, if we were to wire self directly to the store,
//     the optimizer would fold it up somehow.
// (d) Do the rewiring, and return, depending on some other part of
//     the optimizer to fold up the load.
// This routine handles steps (a) and (b).  Steps (c) and (d) are
// specific to loads and stores, so they are handled by the callers.
// (Currently, only LoadNode::Ideal has steps (c), (d).  More later.)

  return NULL;
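// Steps (a) and (b) can be sketched over a simplified memory chain
// (the MemState struct and its fields are hypothetical, not C2's IR):

```cpp
#include <cstddef>

// Hypothetical memory state: each node records the store it represents and
// the previous memory state; base < 0 marks an inscrutable state (a call).
struct MemState {
  MemState* prev;
  long base;     // base object id of the store, -1 if inscrutable
  long offset;   // byte offset of the store
  long size;     // bytes stored
};

// Sketch: dance past stores provably independent of a load at
// (base, off, sz); stop on an exact match or anything inscrutable.
MemState* find_previous_store(MemState* mem, long base, long off, long sz) {
  int cnt = 50;                          // Cycle limiter
  while (mem != NULL) {
    if (--cnt < 0) break;                // Caught in cycle or a complicated dance?
    if (mem->base < 0) break;            // inscrutable pointer
    bool disjoint = (off + sz <= mem->offset) ||
                    (mem->offset + mem->size <= off);
    if (mem->base != base || disjoint) { // bases or offsets provably independent
      mem = mem->prev;                   // (a) advance through independent store
      continue;
    }
    if (mem->offset == off && mem->size == sz)
      return mem;                        // (b) exact match: caller does (c), (d)
    break;                               // overlapping store: cannot unalias
  }
  return NULL;
}
```

// Note the range test instead of st_offset != offset: stores of different
// sizes can co-exist in the same sequence of raw-memory effects.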
  // cannot unalias unless there are precise offsets

  int cnt = 50;             // Cycle limiter
  for (;;) {                // While we can dance past unrelated stores...
    if (--cnt < 0)  break;  // Caught in cycle or a complicated dance?

    break;                  // inscrutable pointer

    // Success:  The offsets are provably independent.
    // (You may ask, why not just test st_offset != offset and be done?
    // The answer is that stores of different sizes can co-exist
    // in the same sequence of RawMem effects.  We sometimes initialize
    // a whole 'tile' of array elements with a single jint or jlong.)
    continue;               // (a) advance through independent store memory

    // Success:  The bases are provably independent.
    continue;               // (a) advance through independent store memory

    // (b) At this point, if the bases or offsets do not agree, we lose,
    // since we have not managed to prove 'this' and 'mem' independent.
    return mem;             // let caller handle steps (c), (d)

    break;                  // something degenerated

    // The bases are provably independent: Either they are
    // manifestly distinct allocations, or else the control
    // of this load dominates the store's allocation.
    continue;               // (a) advance through independent store memory

    // (b) at this point, if we are not looking at a store initializing
    // the same allocation we are loading from, we lose.
    // From caller, can_see_stored_value will consult find_captured_store.
    return mem;             // let caller handle steps (c), (d)

    // Can't use optimize_simple_memory_chain() since it needs PhaseGVN.
    continue;               // (a) advance through independent call memory

    continue;               // (a) advance through independent MemBar memory

    // (the call updated 'mem' value)
    continue;               // (a) advance through independent allocation memory

    // Cannot bypass initialization of the instance
    continue;               // (a) advance through independent MergeMem memory

    // Unless there is an explicit 'continue', we must bail out here,
    // because 'mem' is an inscrutable memory state (e.g., a call).
  }

//----------------------calculate_adr_type-------------------------------------
// Helper function.  Notices when the given type of address hits top or bottom.
// Also, asserts a cross-check of the type against the expected address type.

  // %%%% [phh] We don't check the alias index if cross_check is
  //       TypeRawPtr::BOTTOM.  Needs to be investigated.

  // Recheck the alias index, to see if it has changed (due to a bug).
         "must stay in the original alias category");
  // The type of the address must be contained in the adr_type,
  // disregarding "null"-ness.
  // (We make an exception for TypeRawPtr::BOTTOM, which is a bit bucket.)
         "real address must not escape from expected memory type");
//------------------------adr_phi_is_loop_invariant----------------------------
// A helper function for Ideal_DU_postCCP to check if a Phi in a counted
// loop is loop invariant. Make a quick traversal of Phi and associated
// CastPP nodes, looking to see if they are a closed group within the loop.

  // The idea is that the phi-nest must boil down to only CastPP nodes
  // with the same data. This implies that any path into the loop already
  // includes such a CastPP, and so the original cast, whatever its input,
  // must be covered by an equivalent cast, with an earlier control input.

  // The loop entry input of the phi should be the unique dominating

  // Add the phi node and the cast to the worklist.

  // Begin recursive walk of phi nodes.

  // Take a node off the worklist

  // Add it to the closure.

  // Make a sanity check to ensure we don't waste too much time here.

  // This node is OK if:
  //  - it is a cast of an identical value
  //  - or it is a phi node (then we add its inputs to the worklist)
  // Otherwise, the node is not OK, and we presume the cast is not invariant

  // Quit when the worklist is empty, and we've found no offending nodes.

//------------------------------Ideal_DU_postCCP-------------------------------
// Find any cast-away of null-ness and keep its control.  Null cast-aways are
// going away in this pass and we need to make this memory op depend on the
// gating null check.

// I tried to leave the CastPP's in.  This makes the graph more accurate in
// some sense; we get to keep around the knowledge that an oop is not-null
// after some test.  Alas, the CastPP's interfere with GVN (some values are
// the regular oop, some are the CastPP of the oop, all merge at Phi's which
// cannot collapse, etc).  This cost us 10% on SpecJVM, even when I removed
// some of the more trivial cases in the optimizer.
// Removing more useless Phi's started allowing Loads to illegally float
// above null checks.  I gave up on this approach.  CNC 10/20/2000

// This static method may be called not from MemNode (EncodePNode calls it).
// Only the control edge of the node 'n' might be updated.

  // Need a null check?  Regular static accesses do not because they are
  // from constant addresses.  Array ops are gated by the range check (which
  // always includes a NULL check).  Just check field ops.

  // Scan upwards for the highest location we can place this memory op.

  case Op_AddP:
    // No change to NULL-ness, so peek thru AddP's

    // EncodeP node's control edge could be set by this method
    // when EncodeP node depends on CastPP node.
    // Use its control edge for memory op because EncodeP may go away
    // later when it is folded with a following or preceding DecodeN node.

    // Keep looking for cast nodes.

    // If the CastPP is useless, just peek on through it.

    // Remember the cast that we've peeked through. If we peek
    // through more than one, then we end up remembering the highest
    // one, that is, if in a loop, the one closest to the top.

    // CastPP is going away in this pass!  We need this memory op to be
    // control-dependent on the test that is guarding the CastPP.

    // Attempt to float above a Phi to some dominating point.

    // If we've already peeked through a Cast (which could have set the
    // control), we can't float above a Phi, because the skipped Cast
    // may not be loop invariant.

    // Intentional fallthrough!

    // No obvious dominating point.  The mem op is pinned below the Phi
    // by the Phi itself.  If the Phi goes away (no true value is merged)
    // then the mem op can float, but not indefinitely.  It must be pinned
    // behind the controls leading to the Phi.

    // These usually stick around to change address type, however a
    // useless one can be elided and we still need to pick up a control edge

    // This CheckCastPP node has NO control and is likely useless. But we
    // need to check further up the ancestor chain for a control input to keep
    // the node in place. 4959717.

    // List of "safe" opcodes; those that implicitly block the memory
    // op below any null check.

    break;        // No progress

  case Op_Proj:   // Direct call to an allocation routine

    // We further presume that this is one of
    // new_instance_Java, new_array_Java, or
    // the like, but do not assert for this.

    // similar case to new_instance_Java, etc.

    // Projections from fetch_oop (OSR) are allowed as well.

//=============================================================================

  // standard dump does this in Verbose and WizardMode

//----------------------------is_immutable_value-------------------------------
// Helper function to allow a raw load without control edge for some cases

//----------------------------LoadNode::make-----------------------------------
// Polymorphic factory method:

  // sanity check the alias category against the created node type
         "use LoadKlassNode instead");
         "use LoadRangeNode instead");
  // Check control edge of raw loads
  // oop will be recorded in oop map if load crosses safepoint
         "raw memory operations should have control edge");
//------------------------------hash-------------------------------------------
  // unroll addition of interesting fields

//---------------------------can_see_stored_value------------------------------
// This routine exists to make sure this set of tests is done the same
// everywhere.  We need to make a coordinated change: first LoadNode::Ideal
// will change the graph shape in a way which makes memory alive twice at the
// same time (uses the Oracle model of aliasing), then some
// LoadXNode::Identity will fold things back to the equivalence-class model

  // Skip through chains of MemBarNodes checking the MergeMems for
  // new states for the slice of this load.  Stop once any other
  // kind of node is encountered.  Loads from final memory can skip
  // through any kind of MemBar but normal loads shouldn't skip
  // through MemBarAcquire since that could allow them to move out of
  // a synchronized region.

  // Save the new memory state for the slice and fall through

  // Loop around twice in the case Load -> Initialize -> Store.
  // (See PhaseIterGVN::add_users_to_worklist, which knows about this case.)

  // Try harder before giving up...  Match raw and non-raw pointers.

  // At this point we have proven something like this setup:
  //   A = Allocate(...)
  //   L = LoadQ(,  AddP(CastPP(, A.Parm),, #Off))
  //   S = StoreQ(, AddP(,        A.Parm , #Off), V)
  // (Actually, we haven't yet proven the Q's are the same.)
  // In other words, we are loading from a casted version of
  // the same pointer-and-offset that we stored to.
  // Thus, we are able to replace L by V.

  // Now prove that we have a LoadQ matched to a StoreQ, for some Q.

  // A load from a freshly-created object always returns zero.
  // (This can happen after LoadNode::Ideal resets the load's memory input
  // to find_captured_store, which returned InitializeNode::zero_memory.)
  // return a zero value for the load's basic type
  // (This is one of the few places where a generic PhaseTransform
  // can create new nodes.  Think of it as lazily manifesting
  // virtually pre-existing constants.)

  // A load from an initialization barrier can match a captured store.

  // examine a captured store value
  continue;
  // take one more trip around

//----------------------is_instance_field_load_with_local_phi------------------

//------------------------------Identity---------------------------------------
// Loads are identity if previous store is to same address

  // If the previous store-maker is the right kind of Store, and the store is
  // to the same address, then we are equal to the value stored.

  // byte, short & char stores truncate naturally.
  // A load has to load the truncated value which requires
  // some sort of masking operation and that requires an
  // Ideal call instead of an Identity call.

  // If the input to the store does not fit with the load's result type,
  // it must be truncated via an Ideal call.

  // (This works even when value is a Con, but LoadNode::Value
  // usually runs first, producing the singleton type of the Con.)

  // Search for an existing data phi which was generated before for the same
  // instance's field to avoid infinite generation of phis in a loop.

// Returns true if the AliasType refers to the field that holds the
// cached box array.  Currently only handles the IntegerCache case.

// Fetch the base value in the autobox array

  // Fetch the box object at the base of the array and get its value

  // This should be true; nonstatic_field_at requires calling
  // nof_nonstatic_fields, so check it anyway

// Returns true if the AliasType refers to the value field of an
// autobox object.  Currently only handles Integer.

// We're loading from an object which has autobox behaviour.
// If this object is the result of a valueOf call we'll have a phi
// merging a newly allocated object and a load from the cache.
// We want to replace this load with the original incoming
// argument to the valueOf call.
  // Push the loads from the phi that comes from valueOf up
  // through it to allow elimination of the loads and the recovery
  // of the original value.

  // Get the LoadN node which loads the cached Integer object

  // Eliminate the load of Integer.value for integers from the cache
  // array by deriving the value from the index into the array.
  // Capture the offset of the load and then reverse the computation.

  // Add up all the offsets making up the address of the load

  // Remove the constant offset from the address and then
  // remove the scaling of the offset to recover the original index.

  // Peel the shift off directly but wrap it in a dummy node
  // since Ideal can't return existing nodes

//------------------------------split_through_phi------------------------------
// Split instance field load through Phi.

  // Check for loop invariant.

  // Do nothing here if Identity will find a value
  // (to avoid infinite chain of value phis generation).

  // Skip the split if the region dominates some control edge of the address.

  // Alter data node to use pre-phi inputs

  // Check for a 'win' on some paths
  // See comments in PhaseIdealLoop::split_thru_phi().

  // We now call Identity to try to simplify the cloned node.
  // Note that some Identity methods call phase->type(this).
  // Make sure that the type array is big enough for
  // our new node, even though we may throw the node away.
  // (This tweaking with igvn only works because x is a new node.)

  // If x is a TypeNode, capture any more-precise type permanently into Node;
  // otherwise it will not be updated during igvn->transform since
  // igvn->type(x) is set to x->Value() already.

  // Else x is a new node we are keeping
  // We do not need register_new_node_with_optimizer
  // because set_type has already been called.
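// The reverse computation described above (strip the constant offset, then
// undo the scaling) is plain arithmetic; the helper and constants below are
// illustrative assumptions, not HotSpot's actual layout values:

```cpp
#include <cassert>

// Sketch: an element address inside the cache array has the shape
//   array_base_offset + (index << shift)
// so the original index is recovered by reversing that computation.
long recover_cache_index(long load_offset, long array_base_offset, int shift) {
  long scaled = load_offset - array_base_offset;  // remove the constant offset
  return scaled >> shift;                         // remove the scaling
}
```

// With the index in hand, the boxed value can be derived directly (for
// IntegerCache, value = index + low bound), eliminating the load entirely.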
//------------------------------Ideal------------------------------------------
// If the load is from Field memory and the pointer is non-null, we can
// zero out the control input.
// If the offset is constant and the base is an object allocation,
// try to hook me up to the exact initializing store.

  // Skip up past a SafePoint control.  Cannot do this for Stores because
  // pointer stores & cardmarks must stay on the same side of a SafePoint.

  // Check for useless control edge in some common special cases

  // A method-invariant, non-null address (constant or 'this' argument).

  // try to optimize our memory input

  // Delay this transformation until memory Phi is processed.

  // Split instance field load through Phi.

  // Check for prior store with a different base or offset; make Load
  // independent.  Skip through any number of them.  Bail out if the stores
  // are in an endless dead cycle and report no progress.  This is a key
  // transform for Reflection.  However, if after skipping through the Stores
  // we can't then fold up against a prior store do NOT do the transform as
  // this amounts to using the 'Oracle' model of aliasing.  It leaves the same
  // array memory alive twice: once for the hoisted Load and again after the
  // bypassed Store.  This situation only works if EVERYBODY who does
  // anti-dependence work knows how to bypass.  I.e. we need all
  // anti-dependence checks to ask the same Oracle.  Right now, that Oracle is
  // the alias index stuff.  So instead, peek through Stores and IFF we can
  // fold up, do so.

  // Steps (a), (b):  Walk past independent stores to find an exact match.

  // (c) See if we can fold up on the spot, but don't fold up here;
  // just return a prior value, which is done by Identity calls.

  // Make ready for step (d):

  return NULL;
  // No further progress

// Helper to recognize certain Klass fields which are invariant across
// some group of array types (e.g., int[] or all T[] where T < Object).

  // The field is Klass::_modifier_flags.  Return its (constant) value.
  // (Folds up the 2nd indirection in aClassConstant.getModifiers().)

  // The field is Klass::_access_flags.  Return its (constant) value.
  // (Folds up the 2nd indirection in Reflection.getClassAccessFlags(aClassConstant).)

  // The field is Klass::_layout_helper.  Return its constant value if known.

//------------------------------Value-----------------------------------------
  // Either input is TOP ==> the result is TOP

  // Try to guess loaded type from pointer type

  // Don't do this for integer types. There is only potential profit if
  // the element type t is lower than _type; that is, for int types, if _type is
  // more restrictive than t.  This only happens here if one is short and the other
  // char (both 16 bits), and in those cases we've made an intentional decision
  // to use one kind of load over the other. See AndINode::Ideal and 4965907.
  // Also, do not try to narrow the type for a LoadKlass, regardless of offset.

  // Yes, it is possible to encounter an expression like (LoadKlass p1:(AddP x x 8))
  // where the _gvn.type of the AddP is wider than 8.  This occurs when an earlier
  // copy p0 of (AddP x x 8) has been proven equal to p1, and the p0 has been
  // subsumed by p1.  If p1 is on the worklist but has not yet been re-transformed,
  // it is possible that p1 will have a type like Foo*[int+]:NotNull*+any.
  // In fact, that could have been the original type of p1, and p1 could have
  // had an original form like p1:(AddP x x (LShiftL quux 3)), where the
  // expression (LShiftL quux 3) independently optimized to the constant 8.
  // t might actually be lower than _type, if _type is a unique
  // concrete subclass of abstract class t.

  // Make sure the reference is not into the header, by comparing
  // the offset against the offset of the start of the array's data.
  // Different array types begin at slightly different offsets (12 vs. 16).
  // We choose T_BYTE as an example base type that is least restrictive
  // as to alignment, which will therefore produce the smallest
  // possible base offset.

  // In any case, do not allow the join, per se, to empty out the type.
  // This can happen if an interface-typed array narrows to a class type.

  // The pointers in the autobox arrays are always non-null

  // arrays can be cast to Objects

  // unsafe field access may not have a constant offset
         "Field accesses must be precise" );
  // For oop loads, we expect the _type to be precise

  // For constant Strings treat the final fields as compile time constants.

  // Optimizations for constant objects

  // For constant CallSites treat the target field as a compile time constant.
  // Add a dependence for invalidation of the optimization.

  // arrays can be cast to Objects
  // also allow array-loading from the primary supertype
  // array during subtype checks
         "Field accesses must be precise" );
  // We are loading a field from a Klass metaobject whose identity
  // is known at compile time (the type is "exact" or "precise").
  // Check for fields we know are maintained as constants by the VM.

  // The field is Klass::_super_check_offset.  Return its (constant) value.
  // (Folds up type checking code.)

  // Compute index into primary_supers array
  // Check for overflowing; use unsigned compare to handle the negative case.

  // The field is an element of Klass::_primary_supers.  Return its (constant) value.
  // (Folds up type checking code.)

  // The field is arrayKlass::_component_mirror.  Return its (constant) value.
  // (Folds up aClassConstant.getComponentType, common in Arrays.copyOf.)

  // The field is Klass::_java_mirror.  Return its (constant) value.
  // (Folds up the 2nd indirection in anObjConstant.getClass().)

  // We can still check if we are loading from the primary_supers array at a
  // shallow enough depth.  Even though the klass is not exact, entries less
  // than or equal to its super depth are correct.

  // Compute index into primary_supers array
  // Check for overflowing; use unsigned compare to handle the negative case.

  // The field is an element of Klass::_primary_supers.  Return its (constant) value.
  // (Folds up type checking code.)

  // If the type is enough to determine that the thing is not an array,
  // we can give the layout_helper a positive interval type.
  // This will help short-circuit some reflective code.

  // Note:  When interfaces are reliable, we can narrow the interface
  // test to (klass != Serializable && klass != Cloneable).

  // The key property of this type is that it folds up tests
  // for array-ness, since it proves that the layout_helper is positive.
  // Thus, a generic value like the basic object layout helper works fine.
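// The "unsigned compare" trick mentioned above can be shown in isolation
// (the helper and the constants in the test are illustrative, not the VM's
// actual Klass layout):

```cpp
#include <cassert>

// Sketch: depth is the index derived from the field offset; a single
// unsigned comparison rejects both too-deep entries and a negative depth.
bool primary_super_entry_valid(long offset, long base, long elem_size,
                               long super_depth) {
  long depth = (offset - base) / elem_size;  // compute index into the array
  // unsigned compare handles the negative case in the same test
  return (unsigned long)depth <= (unsigned long)super_depth;
}
```

// A negative depth becomes a huge unsigned value, so one branch covers both
// bounds checks.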
  // If we are loading from a freshly-allocated object, produce a zero,
  // if the load is provably beyond the header of the object.
  // (Also allow a variable load from a fresh array to produce zero.)

  // If we have an instance type and our memory input is the
  // program's initial memory state, there is no matching store,
  // so just return a zero of the appropriate type

//------------------------------match_edge-------------------------------------
// Do we Match on this edge index or not?  Match only the address.

//--------------------------LoadBNode::Ideal--------------------------------------
// If the previous store is to the same address as this load,
// and the value stored was larger than a byte, replace this load
// with the value stored truncated to a byte.  If no truncation is
// needed, the replacement is done in LoadNode::Identity().

  // Identity call will handle the case where truncation is not needed.

  // If the input to the store does not fit with the load's result type,
  // it must be truncated.  We can't delay until Ideal call since
  // a singleton Value is needed for split_thru_phi optimization.

//--------------------------LoadUBNode::Ideal-------------------------------------
// If the previous store is to the same address as this load,
// and the value stored was larger than a byte, replace this load
// with the value stored truncated to a byte.  If no truncation is
// needed, the replacement is done in LoadNode::Identity().

  // Identity call will handle the case where truncation is not needed.

  // If the input to the store does not fit with the load's result type,
  // it must be truncated.  We can't delay until Ideal call since
  // a singleton Value is needed for split_thru_phi optimization.
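// The truncations these Ideal calls insert amount to re-narrowing the stored
// value with the load's size and signedness; a minimal sketch (assuming
// two's-complement, which HotSpot targets):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the value each narrow load must observe when a wider value was
// stored at the same address.
int32_t load_b (int32_t stored) { return (int8_t)stored;  }  // LoadB: signed byte
int32_t load_ub(int32_t stored) { return stored & 0xFF;   }  // LoadUB: unsigned byte
int32_t load_us(int32_t stored) { return stored & 0xFFFF; }  // LoadUS: char
int32_t load_s (int32_t stored) { return (int16_t)stored; }  // LoadS: signed short
```

// Signed loads sign-extend after truncation; unsigned loads simply mask, so
// the four node types need four distinct rewrites.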
//--------------------------LoadUSNode::Ideal-------------------------------------
// If the previous store is to the same address as this load,
// and the value stored was larger than a char, replace this load
// with the value stored truncated to a char.  If no truncation is
// needed, the replacement is done in LoadNode::Identity().

  // Identity call will handle the case where truncation is not needed.

  // If the input to the store does not fit with the load's result type,
  // it must be truncated.  We can't delay until Ideal call since
  // a singleton Value is needed for split_thru_phi optimization.

//--------------------------LoadSNode::Ideal--------------------------------------
// If the previous store is to the same address as this load,
// and the value stored was larger than a short, replace this load
// with the value stored truncated to a short.  If no truncation is
// needed, the replacement is done in LoadNode::Identity().

  // Identity call will handle the case where truncation is not needed.

  // If the input to the store does not fit with the load's result type,
  // it must be truncated.  We can't delay until Ideal call since
  // a singleton Value is needed for split_thru_phi optimization.

//=============================================================================
//----------------------------LoadKlassNode::make------------------------------
// Polymorphic factory method:

  // sanity check the alias category against the created node type

//------------------------------Value------------------------------------------
  // Either input is TOP ==> the result is TOP

  // Return a more precise klass, if possible

  // We are loading a special hidden field from a Class mirror object,
  // the field which points to the VM's Klass metaobject.

  // java_mirror_type returns non-null for compile-time Class constants.
    // constant oop => constant klass
    // a primitive Class (e.g., int.class) has NULL for a klass field
    // (Folds up the 1st indirection in aClassConstant.getModifiers().)
  // non-constant mirror, so we can't tell what's going on
  // See if we can become precise: no subklasses and no interface
  // (Note:  We need to support verified interfaces.)
    //assert(!UseExactTypes, "this code should be useless with exact types");
      // Add a dependence; if any subclass added we need to recompile
      // %%% should use stronger assert_unique_concrete_subtype instead
    // Return precise klass
  // Return root of possible klass

  // Check for loading klass from an array
    // If the klass is an object array, we defer the question to the
    // array component klass.
      // See if we can become precise: no subklasses and no interface
        //assert(!UseExactTypes, "this code should be useless with exact types");
          // Add a dependence; if any subclass added we need to recompile
        // Return precise array klass
      }
    } else {
      // Found a type-array?
      //assert(!UseExactTypes, "this code should be useless with exact types");

  // Check for loading klass from an array klass
      // // Always returning precise element type is incorrect,
      // // e.g., element type could be object and array may contain strings
      // return TypeKlassPtr::make(TypePtr::Constant, elem, 0);
      // The array's TypeKlassPtr was declared 'precise' or 'not precise'
      // according to the element type's subclassing.
    // The field is Klass::_super.  Return its (constant) value.
    // (Folds up the 2nd indirection in aClassConstant.getSuperClass().)

//------------------------------Identity---------------------------------------
// To clean up reflective code, simplify k.java_mirror.as_klass to plain k.
// Also feed through the klass in Allocate(...klass...)._klass.
  if (x != this)  return x;

  // Take apart the address into an oop and an offset.
  // Return 'this' if we cannot.
  // We can fetch the klass directly through an AllocateNode.
  // This works even if the klass is not constant (clone or newArray).
  // Simplify k.java_mirror.as_klass to plain k, where k is a klassOop.
  // Simplify ak.component_mirror.array_klass to plain ak, ak an arrayKlass.
  // See inline_native_Class_query for occurrences of these patterns.
  // Java Example:  x.getClass().isAssignableFrom(y)
  // Java Example:  Array.newInstance(x.getClass().getComponentType(), n)
  // This improves reflective code, often making the Class
  // mirror go completely dead.  (Current exception:  Class
  // mirrors may appear in debug info, but we could clean them out by
  // We are loading a special hidden field from a Class mirror,
  // the field which points to its Klass or arrayKlass metaobject.

//------------------------------Value------------------------------------------
//------------------------------Identity---------------------------------------
// To clean up reflective code, simplify k.java_mirror.as_klass to narrow k.
// Also feed through the klass in Allocate(...klass...)._klass.

//------------------------------Value-----------------------------------------
  // Either input is TOP ==> the result is TOP

//-------------------------------Ideal---------------------------------------
// Feed through the length in AllocateArray(...length...)._length.
  // Take apart the address into an oop and an offset.
  // Return 'this' if we cannot.
  // We can fetch the length directly through an AllocateArrayNode.
  // This works even if the length is not constant (clone or newArray).
  // New CastII improves on this.

//------------------------------Identity---------------------------------------
// Feed through the length in AllocateArray(...length...)._length.
  if (x != this)  return x;
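Feeding the array length through the allocation, as the comments above describe, amounts to answering the length load from the allocation's known length input instead of from memory. A toy model of that identity (illustrative types only; none of this is the real node API):

```cpp
#include <cassert>
#include <cstddef>

// Toy model: a fresh array allocation remembers the length it was
// created with, so a later load of the length field can be answered
// directly, even when the length is not a compile-time constant.
struct ToyArrayAlloc { int length; };

int ideal_length(const ToyArrayAlloc* alloc, int loaded_from_memory) {
  if (alloc != nullptr) return alloc->length;  // feed through the allocation
  return loaded_from_memory;                   // otherwise keep the load
}
```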
  // Take apart the address into an oop and an offset.
  // Return 'this' if we cannot.
  // We can fetch the length directly through an AllocateArrayNode.
  // This works even if the length is not constant (clone or newArray).
  // Do not allow make_ideal_length to allocate a CastII node.
  // Return allocated_length only if it would not be improved by a CastII.

//=============================================================================
//---------------------------StoreNode::make-----------------------------------
// Polymorphic factory method:
  assert(ctl != NULL, "raw memory operations should have control edge");
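StoreNode::make is a polymorphic factory: the caller passes the value's basic type and the factory constructs the matching concrete store node. The dispatch pattern looks roughly like the sketch below (the enum and the string results are illustrative stand-ins for HotSpot's BasicType enum and node classes):

```cpp
#include <cassert>
#include <string>

// Illustrative subset of basic types; HotSpot's BasicType is richer.
enum ToyBasicType { T_BYTE, T_CHAR, T_INT, T_LONG, T_FLOAT, T_DOUBLE };

// Map a value's type to the name of the store node a polymorphic
// factory would build for it.
std::string store_node_for(ToyBasicType t) {
  switch (t) {
    case T_BYTE:   return "StoreB";
    case T_CHAR:   return "StoreC";
    case T_INT:    return "StoreI";
    case T_LONG:   return "StoreL";
    case T_FLOAT:  return "StoreF";
    case T_DOUBLE: return "StoreD";
  }
  return "unknown";
}
```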
//--------------------------bottom_type----------------------------------------
//------------------------------hash-------------------------------------------
  // unroll addition of interesting fields
  //return (uintptr_t)in(Control) + (uintptr_t)in(Memory) + (uintptr_t)in(Address) + (uintptr_t)in(ValueIn);
  // Since they are not commoned, do not hash them:

//------------------------------Ideal------------------------------------------
// Change back-to-back Store(, p, x) -> Store(m, p, y) to Store(m, p, x).
// When a store immediately follows a relevant allocation/initialization,
// try to capture it into the initialization, or hoist it above.
  // Back-to-back stores to same address?  Fold em up.  Generally
  // unsafe if I have intervening uses...  Also disallowed for StoreCM
  // since they must follow each StoreP operation.  Redundant StoreCMs
  // are eliminated just before matching in final_graph_reshape.

  // Looking at a dead closed cycle of memory?
         "no mismatched stores, except on raw memory");

  // If anybody other than 'this' uses 'mem', we cannot fold 'mem' away.
  // For example, 'mem' might be the final state at a conditional return.
  // Or, 'mem' might be used by some node which is live at the same time
  // 'this' is live, which might be unschedulable.  So, require exactly
  // ONE user, the 'this' store, until such time as we clone 'mem' for
  // each of 'mem's uses (thus making the exactly-1-user-rule hold true).
  // It's OK to do this in the parser, since DU info is always accurate,
  // and the parser always refers to nodes via SafePointNode maps.

  // Capture an unaliased, unconditional, simple store into an initializer.
  // Or, if it is independent of the allocation, hoist it above the allocation.
    // If the InitializeNode captured me, it made a raw copy of me,
    // and I need to disappear.
    // %%% hack to ensure that Ideal returns a new node:

  return NULL;                  // No further progress

//------------------------------Value-----------------------------------------
  // Either input is TOP ==> the result is TOP

//------------------------------Identity---------------------------------------
// Remove redundant stores:
//   Store(m, p, Load(m, p)) changes to m.
//   Store(, p, x) -> Store(m, p, x) changes to Store(m, p, x).
  // Load then Store?  Then the Store is useless
  // Two stores in a row of the same value?
  // Store of zero anywhere into a freshly-allocated object?
  // Then the store is useless.
  // (It must already have been captured by the InitializeNode.)
    // a newly allocated object is already all-zeroes everywhere
    // the store may also apply to zero-bits in an earlier object
      // Steps (a), (b):  Walk past independent stores to find an exact match.
        // prev_val and val might differ by a cast; it would be good
        // to keep the more informative of the two.

//------------------------------match_edge-------------------------------------
// Do we Match on this edge index or not?  Match only memory & value

//------------------------------cmp--------------------------------------------
// Do not common stores up together.  They generally have to be split
// back up anyways, so do not bother.
  return (&n == this);          // Always fail except on self

//------------------------------Ideal_masked_input-----------------------------
// Check for a useless mask before a partial-word store
// (StoreB ... (AndI valIn conIa) )
// If (conIa & mask == mask) this simplifies to
// (StoreB ... (valIn) )
//------------------------------Ideal_sign_extended_input----------------------
// Check for useless sign-extension before a partial-word store
// (StoreB ... (RShiftI _ (LShiftI _ valIn conIL ) conIR) )
// If (conIL == conIR && conIR <= num_bits) this simplifies to
// (StoreB ... (valIn) )

//------------------------------value_never_loaded-----------------------------
// Determine whether there are any possible loads of the value stored.
// For simplicity, we actually check if there are any loads from the
// address stored to, not just for loads of the value stored by this node.
  return false;
  // if not a distinct instance, there may be aliases of the address

//=============================================================================
//------------------------------Ideal------------------------------------------
// If the store is from an AND mask that leaves the low bits untouched, then
// we can skip the AND operation.  If the store is from a sign-extension
// (a left shift, then right shift) we can skip both.
  // Finally check the default case

//=============================================================================
//------------------------------Ideal------------------------------------------
// If the store is from an AND mask that leaves the low bits untouched, then
// we can skip the AND operation
  // Finally check the default case

//=============================================================================
//------------------------------Identity---------------------------------------
  // No need to card mark when storing a null ptr

//=============================================================================
//------------------------------Ideal---------------------------------------
//------------------------------Value-----------------------------------------
  // Either input is TOP ==> the result is TOP
  // If extra input is TOP ==> the result is TOP

//=============================================================================
//----------------------------------SCMemProjNode------------------------------
//=============================================================================
//=============================================================================
//-------------------------------adr_type--------------------------------------
//------------------------------match_edge-------------------------------------
// Do we Match on this edge index or not?  Do not match memory

//------------------------------Identity---------------------------------------
// Clearing a zero length array does nothing

//------------------------------Idealize---------------------------------------
// Clearing a short array is faster with stores
  // Clearing nothing uses the Identity call.
  // Negative clears are possible on dead ClearArrays
  // Length too long; use fast hardware clear
  // adjust atp to be the correct array element address type
  // Get base for derived pointer purposes

//----------------------------step_through----------------------------------
// Return allocation input memory edge if it is different instance
// or itself if it is the one we are looking for.
  // This method is called only before Allocate nodes are expanded during
  // macro nodes expansion. Before that ClearArray nodes are only generated
  // in LibraryCallKit::generate_arraycopy() which follows allocations.
  // Can not bypass initialization of the instance we are looking for.

//----------------------------clear_memory-------------------------------------
// Generate code to initialize object storage to zero.
  // Initialize the remaining stuff, if any, with a ClearArray.
  // Scale to the unit required by the CPU:
  // Bulk clear double-words

//=============================================================================
// Do not match memory edge.

//------------------------------Ideal------------------------------------------
// Return a node which is more "ideal" than the current node.  Strip out
  // Don't bother trying to transform a dead node
    // If transformed to a MergeMem, get the desired slice

//------------------------------Value------------------------------------------

//=============================================================================
//------------------------------cmp--------------------------------------------
  return (&n == this);          // Always fail except on self

//------------------------------make-------------------------------------------

//------------------------------Ideal------------------------------------------
// Return a node which is more "ideal" than the current node.  Strip out
  // Don't bother trying to transform a dead node
  // Eliminate volatile MemBars for scalar replaced objects.
    // Volatile field loads and stores.
      // Check for scalar replaced object reference.
        // Replace MemBar projections by its inputs.
        // Must return either the original node (now dead) or a new node
        // (Do not return a top here, since that would break the uniqueness of top.)

//------------------------------Value------------------------------------------

//------------------------------match------------------------------------------
// Construct projections for memory.

//===========================InitializeNode====================================
// This node acts as a memory barrier on raw memory, after some raw stores.
// The 'cooked' oop value feeds from the Initialize, not the Allocation.
// The Initialize can 'capture' suitably constrained stores as raw inits.
// It can coalesce related raw stores into larger units (called 'tiles').
// It can avoid zeroing new storage for memory units which have raw inits.
// At macro-expansion, it is marked 'complete', and does not optimize further.
//
// The object 'new short[2]' occupies 16 bytes in a 32-bit machine.
//   ctl = incoming control; mem* = incoming memory
// (Note:  A star * on a memory edge denotes I/O and other standard edges.)
// First allocate uninitialized memory and fill in the header:
//   alloc = (Allocate ctl mem* 16 #short[].klass ...)
//   ctl := alloc.Control; mem* := alloc.Memory*
//   rawmem = alloc.Memory; rawoop = alloc.RawAddress
// Then initialize to zero the non-header parts of the raw memory block:
//   init = (Initialize alloc.Control alloc.Memory* alloc.RawAddress)
//   ctl := init.Control; mem.SLICE(#short[*]) := init.Memory
// After the initialize node executes, the object is ready for service:
//   oop := (CheckCastPP init.Control alloc.RawAddress #short[])
// Suppose its body is immediately initialized as {1,2}:
//   store1 = (StoreC init.Control init.Memory (+ oop 12) 1)
//   store2 = (StoreC init.Control store1      (+ oop 14) 2)
//   mem.SLICE(#short[*]) := store2
//
// An InitializeNode collects and isolates object initialization after
// an AllocateNode and before the next possible safepoint.  As a
// memory barrier (MemBarNode), it keeps critical stores from drifting
// down past any safepoint or any publication of the allocation.
// Before this barrier, a newly-allocated object may have uninitialized bits.
// After this barrier, it may be treated as a real oop, and GC is allowed.
//
// The semantics of the InitializeNode include an implicit zeroing of
// the new object from object header to the end of the object.
// (The object header and end are determined by the AllocateNode.)
//
// Certain stores may be added as direct inputs to the InitializeNode.
// These stores must update raw memory, and they must be to addresses
// derived from the raw address produced by AllocateNode, and with
// a constant offset.  They must be ordered by increasing offset.
// The first one is at in(RawStores), the last at in(req()-1).
// Unlike most memory operations, they are not linked in a chain,
// but are displayed in parallel as users of the rawmem output of
// the allocation.
//
// (See comments in InitializeNode::capture_store, which continue
// the example given above.)
//
// When the associated Allocate is macro-expanded, the InitializeNode
// may be rewritten to optimize collected stores.  A ClearArrayNode
// may also be created at that point to represent any required zeroing.
// The InitializeNode is then marked 'complete', prohibiting further
// capturing of nearby memory operations.
//
// During macro-expansion, all captured initializations which store
// constant values of 32 bits or smaller are coalesced (if advantageous)
// into larger 'tiles' 32 or 64 bits.  This allows an object to be
// initialized in fewer memory operations.  Memory words which are
// covered by neither tiles nor non-constant stores are pre-zeroed
// by explicit stores of zero.  (The code shape happens to do all
// zeroing first, then all other stores, with both sequences occurring
// in order of ascending offsets.)
//
// Alternatively, code may be inserted between an AllocateNode and its
// InitializeNode, to perform arbitrary initialization of the new object.
// E.g., the object copying intrinsics insert complex data transfers here.
// The initialization must then be marked as 'complete' to disable the
// built-in zeroing semantics and the collection of initializing stores.
//
// While an InitializeNode is incomplete, reads from the memory state
// produced by it are optimizable if they match the control edge and
// They return a stored value (if the offset matches) or else zero.
//
// A write to the memory state, if it matches control and address,
// and if it is to a constant offset, may be 'captured' by the
// InitializeNode.  It is cloned as a raw memory operation and rewired
// inside the initialization, to the raw oop produced by the allocation.
// Operations on addresses which are provably distinct (e.g., to
// other AllocateNodes) are allowed to bypass the initialization.
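The capture rules above boil down to a few cheap checks on the candidate store's constant offset. Below is a simplified predicate covering just the offset window; the header size, word size, and limit are illustrative parameters, and the real checks also require control, memory, and value independence:

```cpp
#include <cassert>

// Simplified view of the offset test for capturing a store into an
// InitializeNode: the offset must be constant, past the object
// header, and within the tracked window of
// TrackedInitializationLimit words.
bool offset_capturable(long offset, long header_size,
                       long limit_in_words, long word_size) {
  if (offset < header_size)                 return false;  // no header fields
  if (offset >= limit_in_words * word_size) return false;  // beyond window
  return true;
}
```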
// The effect of all this is to consolidate object initialization
// (both arrays and non-arrays, both piecewise and bulk) into a
// single location, where it can be optimized as a unit.
//
// Only stores with an offset less than TrackedInitializationLimit words
// will be considered for capture by an InitializeNode.  This puts a
// reasonable limit on the complexity of optimized initializations.

//---------------------------InitializeNode------------------------------------
  // Note:  allocation() can be NULL, for secondary initialization barriers

// Since this node is not matched, it will be processed by the
// register allocator.  Declare that there are no constraints
// on the allocation of the RawAddress edge.
  // This edge should be set to top, by the set_complete.  But be conservative.

  // incoming raw memory is not split

  // After this node is complete, it contains a bunch of
  // raw-memory initializations.  There is no need for
  // it to have anything to do with non-raw memory effects.
  // Therefore, tell all non-raw users to re-optimize themselves,
  // after skipping the memory effects of this initialization.

// convenience function
// return false if the init contains any stores already
  // for now, if this allocation has already collected any inits, bail:
  // delete any empty spaces created:

// Helper for remembering which stores go with which offsets.
  if (!st->is_Store())  return -1;  // can happen to dead code via subsume_node

// Helper for proving that an initialization expression is
// "simple enough" to be folded into an object initialization.
// Attempts to prove that a store's initial value 'n' can be captured
// within the initialization without creating a vicious cycle, such as:
//     { Foo p = new Foo(); p.next = p; }
// True for constants and parameters and small combinations thereof.
  if (n == NULL)      return true;   // (can this really happen?)
  if (n == this)      return false;  // found a cycle
  if (n->is_Start())  return true;   // params, etc., are OK
  if (ctl == this)  return false;

  // If we already know that the enclosing memory op is pinned right after
  // the init, then any control flow that the store has picked up
  // must have preceded the init, or else be equal to the init.
  // Even after loop optimizations (which might change control edges)
  // a store is never pinned *before* the availability of its inputs.
  return false;                  // failed to prove a good control

  // Check data edges for possible dependencies on 'this'.
  if ((count += 1) > 20)  return false;  // complexity limit
  if (i != first_i)  continue;           // process duplicate edge just once

// Here are all the checks a Store must pass before it can be moved into
// an initialization.  Returns zero if a check fails.
// On success, returns the (constant) offset to which the store applies,
// within the initialized memory.
    return FAIL;
                              // an inscrutable StoreNode (card mark?)
    return FAIL;              // must be unconditional after the initialization
    return FAIL;              // must not be preceded by other stores
    return FAIL;              // inscrutable address
    return FAIL;              // wrong allocation!  (store needs to float up)
    return FAIL;              // stored value must be 'simple enough'

// Find the captured store in(i) which corresponds to the range
// If there is one, return its index i.  If there isn't, return the
// negative of the index where it should be inserted.
// Return 0 if the queried range overlaps an initialization boundary
// or if dead code is encountered.
// If size_in_bytes is zero, do not bother with overlap checks.
    return FAIL;              // arraycopy got here first; punt

  // no negatives, no header fields:
  // after a certain size, we bail out on tracking all the stores:
  if (i >= limit)  return -(int)i;  // not found; here is where to put it
    return FAIL;              // bail out if there is dead garbage
    // ...we are done, since stores are ordered
    return FAIL;              // the next store overlaps
    return -(int)i;           // not found; here is where to put it
    return FAIL;              // the previous store overlaps
  return FAIL;                // mismatched store size

// Look for a captured store which initializes at the offset 'start'
// with the given size.  If there is no such store, and no other
// initialization interferes, then return zero_memory (the memory
// projection of the AllocateNode).
  Node* st = in(i);
                              // here is the store at this position

// Create, as a raw pointer, an address within my new object at 'offset'.

// Clone the given store, converting it into a raw store
// initializing a field or element of my new object.
// Caller is responsible for retiring the original store,
// with subsume_node or the like.
//
// From the example above InitializeNode::InitializeNode,
// here are the old stores to be captured:
//   store1 = (StoreC init.Control init.Memory (+ oop 12) 1)
//   store2 = (StoreC init.Control store1      (+ oop 14) 2)
//
// Here is the changed code; note the extra edges on init:
//   alloc = (Allocate ...)
//   rawoop = alloc.RawAddress
//   rawstore1 = (StoreC alloc.Control alloc.Memory (+ rawoop 12) 1)
//   rawstore2 = (StoreC alloc.Control alloc.Memory (+ rawoop 14) 2)
//   init = (Initialize alloc.Control alloc.Memory rawoop
//                      rawstore1 rawstore2)
    if (i == 0)  return NULL;  // bail out
    prev_mem = in(i);          // there is a pre-existing store under this one
    // See StoreNode::Ideal 'st->outcnt() == 1' for the reason to disconnect.
    i = -i;                    // no pre-existing store
    set_req(--i, C->top());    // reuse this edge; it has been folded away

  // At this point, new_st might have swallowed a pre-existing store
  // at the same offset, or perhaps new_st might have disappeared,
  // if it redundantly stored the same value (or zero to fresh memory).

  // In any case, wire it in:

  // The caller may now kill the old guy.

  return false;                // strange store offset (assume size==2**N)
  default:  return false;      // strange store size (detect size!=2**N here)
  return true;
                               // return success to caller

// Coalesce subword constants into int constants and possibly
// into long constants.  The goal, if the CPU permits,
// is to initialize the object with a small number of 64-bit tiles.
// Also, convert floating-point constants to bit patterns.
// Non-constants are not relevant to this pass.
//
// In terms of the running example on InitializeNode::InitializeNode
// and InitializeNode::capture_store, here is the transformation
// of rawstore1 and rawstore2 into rawstore12:
//   alloc = (Allocate ...)
//   rawoop = alloc.RawAddress
//   tile12 = 0x00010002
//   rawstore12 = (StoreI alloc.Control alloc.Memory (+ rawoop 12) tile12)
//   init = (Initialize alloc.Control alloc.Memory rawoop rawstore12)
  // Note:  After this pass, they are not completely sane,
  // since there may be some overlaps.

  // allocate space for the tile map:
  // tiles: exact bitwise model of all primitive constants
  // nodes: last constant-storing node subsumed into the tiles model
  // inits: which bytes (in each tile) are touched by any initializations

  //// Pass A: Fill in the tile model with any relevant stores.
    // Figure out the store's offset and constant value:
    // Record which bytes are touched, whether by constant or not.
      continue;                // skip (strange store size)
      default:  continue;      // skip (odd store type)
      continue;                // This StoreL is already optimal.
    // Store down the constant.
      // This StoreI is already optimal by itself.
      // If the previous store is also optimal by itself, back up and
      // undo the action of the previous loop iteration... if we can.
      // But if we can't, just let the previous half take care of itself.
        // Undo the effects of the previous loop trip, which swallowed st:
        continue;              // This StoreI is already optimal.
    // This store is not needed.
    return;                    // nothing more to do

  //// Pass B: Convert any non-zero tiles into optimal constant stores.
  // Be sure to insert them before overlapping non-constant stores.
  // (E.g., byte[] x = { 1,2,y,4 }  =>  x[int 0] = 0x01020004, x[2]=y.)
      split = true;            // only the second word counts
      // Example:  int a[] = { 42 ... }
      split = true;            // first word is covered by full inits
      // Example:  int a[] = { ... foo(), 42 ... }
      split = true;            // second word is covered by full inits
      // Example:  int a[] = { ... 42, foo() ... }
    // Here's a case where init0 is neither 0 nor -1:
    //   byte a[] = { ... 0,0,foo(),0,  0,0,0,42 ... }
    // Assuming big-endian memory, init0, init1 are 0x0000FF00, 0x000000FF.
    // In this case the tile is not split; it is (jlong)42.
    // The big tile is stored down, and then the foo() value is inserted.
    // (If there were foo(),foo() instead of foo(),0, init0 would be -1.)
    // One or two coalesced stores to plop down.
    // Omit either if it is a zero.
    // Insert second store first, then the first before the second.
    // Insert each one just before any overlapping non-constant stores.

  // Clean up any remaining occurrences of zmem:

// Explore forward from in(start) to find the first fully initialized
// word, and return its offset.  Skip groups of subword stores which
// together initialize full words.  If in(start) is itself part of a
// fully initialized word, return the offset of in(start).  If there
// are no following full-word stores, or if something is fishy, return
    if (st_off < 0)  break;    // return conservative answer
      return st_off;           // we found a complete word init
      // Did this store hit or cross the word boundary?
      // We passed the current int, without fully initializing it.
      // We passed the current and next int.

// Called when the associated AllocateNode is expanded into CFG.
// At this point, we may perform additional optimizations.
// Linearize the stores by ascending offset, to make memory
// activity as coherent as possible.
  // reduce instruction count for common initialization patterns
  bool do_zeroing = true;      // we might give up if inits are very sparse
      break;                   // unknown junk in the inits
      break;                   // complicated store chains somehow in list

    // See if this store needs a zero before it or under it.
      // Look for subword stores which only partially initialize words.
      // If we find some, we must lay down some word-level zeroes first,
      // underneath the subword stores.
      //
      //   byte[] a = { p,q,r,s }  =>  a[0]=p,a[1]=q,a[2]=r,a[3]=s
      //   byte[] a = { x,y,0,0 }  =>  a[0..3] = 0, a[0]=x,a[1]=y
      //   byte[] a = { 0,0,z,0 }  =>  a[0..3] = 0, a[2]=z
      //
      // Note:  coalesce_subword_stores may have already done this,
      // if it was prompted by constant non-zero subword initializers.
      // But this case can still arise with non-constant stores.

      // In the examples above:
      //   in(i)          p   q   r   s     x   y     z
      //   st_off        12  13  14  15    12  13    14
      //   st_size        1   1   1   1     1   1     1
      //   next_full_s.  12  16  16  16    16  16    16
      //   z's_done      12  16  16  16    12  16    12
      //   z's_needed    12  16  16  16    16  16    16
      //   zsize          0   0   0   0     4   0     4
      // Conservative tack:  Zero to end of current word.
      // Zero to beginning of next fully initialized word.
      // Or, don't zero at all, if we are already in that word.

      // Do some incremental zeroing on rawmem, in parallel with inits.

    // Collect the store and move on:

    // Various order invariants.  Weaker than stores_are_sane because
    // a large constant tile can be filled in by smaller non-constant stores.

  // If anything remains to be zeroed, zero it all now.
  // if it is the last unused 4 bytes of an instance, forget about it

  return true;                 // stores could be anything at this point
    if (st_off < 0)  continue; // ignore dead garbage

//============================MergeMemNode=====================================
//
// SEMANTICS OF MEMORY MERGES:  A MergeMem is a memory state assembled from several
// contributing store or call operations.  Each contributor provides the memory
// state for a particular "alias type" (see Compile::alias_type).  For example,
// if a MergeMem has an input X for alias category #6, then any memory reference
// to alias category #6 may use X as its memory state input, as an exact equivalent
// to using the MergeMem as a whole.
//   Load<6>( MergeMem(<6>: X, ...), p ) <==> Load<6>(X,p)
// (Here, the <N> notation gives the index of the relevant adr_type.)
//
// In one special case (and more cases in the future), alias categories overlap.
// The special alias category "Bot" (Compile::AliasIdxBot) includes all memory
// states.  Therefore, if a MergeMem has only one contributing input W for Bot,
// it is exactly equivalent to that state W:
//   MergeMem(<Bot>: W) <==> W
//
// Usually, the merge has more than one input.  In that case, where inputs
// overlap (i.e., one is Bot), the narrower alias type determines the memory
// state for that type, and the wider alias type (Bot) fills in everywhere else:
//   Load<5>( MergeMem(<Bot>: W, <6>: X), p ) <==> Load<5>(W,p)
//   Load<6>( MergeMem(<Bot>: W, <6>: X), p ) <==> Load<6>(X,p)
//
// A merge can take a "wide" memory state as one of its narrow inputs.
// This simply means that the merge observes out only the relevant parts of
// the wide input.  That is, wide memory states arriving at narrow merge inputs
// are implicitly "filtered" or "sliced" as necessary.  (This is rare.)
//
// These rules imply that MergeMem nodes may cascade (via their <Bot> links),
// and that memory slices "leak through":
//   MergeMem(<Bot>: MergeMem(<Bot>: W, <7>: Y)) <==> MergeMem(<Bot>: W, <7>: Y)
//
// But, in such a cascade, repeated memory slices can "block the leak":
//   MergeMem(<Bot>: MergeMem(<Bot>: W, <7>: Y), <7>: Y') <==> MergeMem(<Bot>: W, <7>: Y')
//
// In the last example, Y is not part of the combined memory state of the
// outermost MergeMem.  The system must, of course, prevent unschedulable
// memory states from arising, so you can be sure that the state Y is somehow
// a precursor to state Y'.
//
//
// REPRESENTATION OF MEMORY MERGES:  The indexes used to address the Node::in array
// of each MergeMemNode array are exactly the numerical alias indexes, including
// but not limited to AliasIdxTop, AliasIdxBot, and AliasIdxRaw.  The functions
// Compile::alias_type (and kin) produce and manage these indexes.
//
// By convention, the value of in(AliasIdxTop) (i.e., in(1)) is always the top node.
// (Note that this provides quick access to the top node inside MergeMem methods,
// without the need to reach out via TLS to Compile::current.)
//
// As a consequence of what was just described, a MergeMem that represents a full
// memory state has an edge in(AliasIdxBot) which is a "wide" memory state,
// containing all alias categories.
//
// MergeMem nodes never (?) have control inputs, so in(0) is NULL.
//
// All other edges in(N) (including in(AliasIdxRaw), which is in(3)) are either
// a memory state for the alias type <N>, or else the top node, meaning that
// there is no particular input for that alias type.  Note that the length of
// a MergeMem is variable, and may be extended at any time to accommodate new
// memory states at larger alias indexes.  When merges grow, they are of course
// filled with "top" in the unused in() positions.
//
// This use of top is named "empty_memory()", or "empty_mem" (no-memory) as a variable.
// (Top was chosen because it works smoothly with passes like GCM.)
//
// For convenience, we hardwire the alias index for TypeRawPtr::BOTTOM.  (It is
// the type of random VM bits like TLS references.)  Since it is always the
// first non-Bot memory slice, some low-level loops use it to initialize an
// index variable:  for (i = AliasIdxRaw; i < req(); i++).
//
//
// ACCESSORS:  There is a special accessor MergeMemNode::base_memory which returns
// the distinguished "wide" state.  The accessor MergeMemNode::memory_at(N) returns
// the memory state for alias type <N>, or (if there is no particular slice at <N>)
// it returns the base memory.  To prevent bugs, memory_at does not accept <Top>
// or <Bot> indexes.  The iterator MergeMemStream provides robust iteration over
// MergeMem nodes or pairs of such nodes, ensuring that the non-top edges are visited.
//
// %%%% We may get rid of base_memory as a separate accessor at some point; it isn't
// really that different from the other memory inputs.  An abbreviation called
// "bot_memory()" for "memory_at(AliasIdxBot)" would keep code tidy.
//
//
// PARTIAL MEMORY STATES:  During optimization, MergeMem nodes may arise that represent
// partial memory states.  When a Phi splits through a MergeMem, the copy of the Phi
// that "emerges through" the base memory will be marked as excluding the alias types
// of the other (narrow-memory) copies which "emerged through" the narrow edges:
//
//   Phi<Bot>(U, MergeMem(<Bot>: W, <8>: Y))
//     ==Ideal=>  MergeMem(<Bot>: Phi<Bot-8>(U, W), Phi<8>(U, Y))
//
// This strange "subtraction" effect is necessary to ensure IGVN convergence.
// (It is currently unimplemented.)  As you can see, the resulting merge is
// actually a disjoint union of memory states, rather than an overlay.
//------------------------------MergeMemNode-----------------------------------
  // all inputs are nullified in Node::Node(int)
  // set_input(0, NULL);  // no control input
  // Initialize the edges uniformly to top, for starters.

// Make a new, untransformed MergeMem with the same base as 'mem'.
// If mem is itself a MergeMem, populate the result with the same edges.

//------------------------------cmp--------------------------------------------
  return (&n == this);          // Always fail except on self

//------------------------------Identity---------------------------------------
  // Identity if this merge point does not record any interesting memory
    return this;                // Many memory splits; no change
  return base_mem;              // No memory splits; ID on the one true input

//------------------------------Ideal------------------------------------------
// This method is invoked recursively on chains of MergeMem nodes
  // Remove chain'd MergeMems
  //
  // This is delicate, because each "in(i)" (i >= Raw) is interpreted
  // relative to the "in(Bot)".  Since we are patching both at the same time,
  // we have to be careful to read each "in(i)" relative to the old "in(Bot)",
  // but rewrite each "in(i)" relative to the new "in(Bot)".

  // simplify stacked MergeMems in base memory

  // the base memory might contribute new slices beyond my req()

  // Look carefully at the base node if it is a phi.
  // do not examine phi if degraded to a copy
  // see if the phi is unfinished
  // incomplete phi; do not look at it yet!

  // Note:  We do not call verify_sparse on entry, because inputs
  // can normalize to the base_memory via subsume_node or similar
  // mechanisms.  This method repairs that damage.

  // Look at each slice.
    // calculate the old memory value
    // maybe update (reslice) the old memory value
      // simplify stacked MergeMems
        // This can happen if loops break up and safepoints disappear.
        // A merge of BotPtr (default) with a RawPtr memory derived from a
        // safepoint can be rewritten to a merge of the same BotPtr with
        // the BotPtr phi coming into the loop.  If that phi disappears
        // also, we can end up with a self-loop of the mergemem.
        // In general, if loops degenerate and memory effects disappear,
        // a mergemem can be left looking at itself.  This simply means
        // that the mergemem's default should be used, since there is
        // no longer any apparent effect on this slice.
        // Note: If a memory slice is a MergeMem cycle, it is unreachable
        //       from start.  Update the input to TOP.
      // else preceding memory was not a MergeMem

    // replace equivalent phis (unfortunately, they do not GVN together)
      // equivalent phi nodes; revert to the def

    // maybe store down a new value

    // Warning:  Do not combine this "if" with the previous "if"
    // A memory slice might have to be rewritten even if it is semantically
    // unchanged, if the base_memory value has changed.
    // Don't use set_base_memory(new_base), because we need to update du.

  // a self cycle indicates this memory path is dead

  // Resolve external cycles by calling Ideal on a MergeMem base_memory
  // Recursion must occur after the self cycle check above
    // propagate rollup of dead cycle to self

    // Cut inputs during Parse phase only.
    // During Optimize phase a dead MergeMem node will be subsumed by Top.

  // Check if PhiNode::Ideal's "Split phis through memory merges"
  // transform should be attempted.  Look for this->phi->this cycle.

//-------------------------set_base_memory-------------------------------------
  // Clear out other occurrences of new_base:

//------------------------------out_RegMask------------------------------------
//------------------------------dump_spec--------------------------------------

  if (a == b)  return true;
  // phis shift around during optimization
  return true;  // pretty stupid...

// verify a narrow slice (either incoming or outgoing)
  // Elide intervening MergeMem's
  // Implicit copy of base_memory()
  // A few places like make_runtime_call "know" that VM calls are narrow,
  // and can be used to update only the VM bits stored as TypeRawPtr::BOTTOM.
  // memory can "leak through" calls on channels that
  // are write-once.  Allow this also.

//-----------------------------memory_at---------------------------------------
0N/A // Otherwise, it is a narrow slice. 0N/A // the array is sparse; empty slots are the "top" node 0N/A "must be a wide memory");
0N/A // AliasLevel == 0 if we are organizing the memory states manually. 0N/A // See verify_memory_slice for comments on TypeRawPtr::BOTTOM. 0N/A // make sure the stored slice is sane 0N/A // Give it a pass: It is a mostly harmless repetition of the base. 0N/A // This can arise normally from node subsumption during optimization. 0N/A//---------------------------set_memory_at------------------------------------- 0N/A if (n ==
empty_mem)
return;
// already the default, so do not grow me 0N/A // grow the sparse array 0N/A//--------------------------iteration_setup------------------------------------ 0N/A // invariant: the finite support of mm2 is within mm->req() 0N/A // Replace spurious copies of base_memory by top. 0N/A//---------------------------grow_to_match------------------------------------- 0N/A // look for the finite support of the other memory 0N/A//---------------------------verify_sparse------------------------------------- 0N/A // The following can happen in degenerate cases, since empty==top. 0N/A if (
in(i) ==
base_mem)
return false;
// should have been the sentinel value! 0N/A if (
mem == n)
return true;
// might be empty_memory()