escape.cpp revision 2956
 * Copyright (c) 2005, 2011, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
// Add ConP(#NULL) and ConN(#NULL) nodes.
// don't add a self-referential edge, this can occur during removal of
// We are computing a raw address for a store captured by an Initialize
// compute an appropriate address type. AddP cases #3 and #5 (see below).
"offset must be a constant or it is initialization of array");
// Don't change non-escaping state of NULL pointer.
// inline set_escape_state(idx, es);
// If we are still collecting or there were no non-escaping allocations
// we don't know the answer yet
// if the node was created after the escape computation, return
// if we have already computed a value, return it
// PointsTo() calls n->uncast() which can return a new ideal node.
// compute max escape state of anything this node could point to
// cache the computed escape state
}
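The comments above describe a cached query: return a previously computed value if there is one, otherwise take the maximum escape state over everything the node could point to and cache it. A minimal sketch of that idea follows. The `EscapeCache` type, its `escape_state` method, and the precomputed `pointees` vector are hypothetical names used only for illustration, and the enum is a simplified stand-in for the real escape-state type, ordered so that a larger value means "escapes more".

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <vector>

// Simplified, hypothetical escape lattice: larger value == escapes more.
enum class EscapeState { Unknown = 0, NoEscape = 1, ArgEscape = 2, GlobalEscape = 3 };

// Hypothetical cache keyed by node index (not the HotSpot data structure).
struct EscapeCache {
  std::unordered_map<int, EscapeState> cached;

  // 'pointees' holds the escape states of everything the node may point to;
  // this sketch assumes the caller has already collected them.
  EscapeState escape_state(int node_idx, const std::vector<EscapeState>& pointees) {
    auto it = cached.find(node_idx);
    if (it != cached.end()) {
      return it->second;               // already computed: return the cached value
    }
    EscapeState es = EscapeState::NoEscape;
    for (EscapeState p : pointees) {
      es = std::max(es, p);            // max escape state of anything pointed to
    }
    cached[node_idx] = es;             // cache the computed escape state
    return es;
  }
};
```

Note that a second query for the same node index returns the cached value even if a different `pointees` list is passed, which mirrors the "if we have already computed a value, return it" comment.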
// orig_es could be PointsToNode::UnknownEscape // If we have a JavaObject, return just that object // ensure that all inputs of a Phi have been processed assert(
false,
"neither PointsToEdge or DeferredEdge");
// no deferred or pointsto edges found. Assume the value was set // outside this method. Add the phantom object to the pointsto set. // This method is most expensive during ConnectionGraph construction. // Reuse vectorSet and an additional growable array for deferred edges. // No deferred or pointsto edges found. Assume the value was set // outside this method. Add edge to phantom object. // Mark current edges as visited and move deferred edges to separate array. // No deferred or pointsto edges found. Assume the value was set // outside this method. Add edge to phantom object. // Special case - field set outside (globally escaping). assert(
false,
"invalid connection graph");
// Add an edge to node given by "to_i" from any field of adr_i whose offset // matches "offset" A deferred edge is added if to_i is a LocalVar, and // a pointsto edge is added if it is a JavaObject // Add a deferred edge from node given by "from_i" to any field of adr_i // whose offset matches "offset". // Assume the field was set outside this method if it is not Allocation // Some fields references (AddP) may still be missing // until Connection Graph construction is complete. // For example, loads from RAW pointers with offset 0 // which don't have AddP. // A reference to phantom_object will be added if // a field reference is still missing after completing // Connection Graph (see remove_deferred()). // AddP cases for Base and Address inputs: // case #1. Direct object's field reference: // Proj #5 ( oop result ) // CheckCastPP (cast to instance type) // AddP ( base == address ) // case #2. Indirect object's field reference: // CastPP (cast to instance type) // AddP ( base == address ) // case #3. Raw object's field reference for Initialize node: // Proj #5 ( oop result ) // case #4. Array's element reference: // {CheckCastPP | CastPP} // | AddP ( array's element offset ) // AddP ( array's offset ) // case #5. Raw object's field reference for arraycopy stub call: // The inline_native_clone() case when the arraycopy stub is called // after the allocation before Initialize and CheckCastPP nodes. // Proj #5 ( oop result ) // AddP ( base == address ) // case #6. Constant Pool, ThreadLocal, CastX2P or // Raw object's field reference: // {ConP, ThreadLocal, CastX2P, raw Load} // case #7. Klass's field reference. // AddP ( base == address ) // case #8. narrow Klass's field reference. // AddP ( base == address ) // Case #6 (unsafe access) may have several chained AddP nodes. // Find array's offset to push it on worklist first and // as result process an array's element offset first (pushed second) // to avoid CastPP for the array's offset. 
// Otherwise the inserted CastPP (LocalVar) will point to what
// the AddP (Field) points to, which would be wrong since
// the algorithm expects the CastPP to have the same points-to set
// as AddP's base CheckCastPP (LocalVar).
// memProj (from ArrayAllocation CheckCastPP)
// | || Int (element index)
// | || | ConI (log(element size))
// | AddP (array's element offset)
// | | ConI (array's offset: #12(32-bits) or #24(64-bits))
// Load/Store (memory operation on array's element)
// Adjust the type and inputs of an AddP which computes the
// address of a field of an instance
// We are computing a raw address for a store captured by an Initialize
// compute an appropriate address type (cases #3 and #5).
"old type must be non-instance or match new type");
// The type 't' could be a subclass of 'base_t'.
// As a result, t->offset() could be larger than base_t's size, which will
// cause a failure in add_offset() with narrow oops since the TypeOopPtr()
// constructor verifies the correctness of the offset.
// It could happen on a subclass's branch (from the type profiling
// inlining) which was not eliminated during parsing since the exactness
// of the allocation type was not propagated to the subclass type check.
// Or the type 't' could be unrelated to 'base_t' altogether.
// It could happen when the CHA type is different from the MDO type on a dead path
// (for example, from an instanceof check) which is not collapsed during parsing.
// Do nothing for such an AddP node and don't process its users since
// this code branch will go away.
return false;
// bail out
// Do NOT remove the next line: ensure a new alias index is allocated
// for the instance type. Note: C++ will not remove it since the call
// record the allocation in the node map
// Set addp's Base and Address to 'base'.
// Skip AddP cases #3 and #5.
// AddP case #4 (adr is array's element offset AddP node)
// Put on IGVN worklist since at least addp's type was changed above.
// Create a new version of orig_phi if necessary. Returns either the newly
// created phi or an existing phi. Sets create_new to indicate whether a new
// phi was created. Cache the last newly created phi in the node map.
// nothing to do if orig_phi is bottom memory or matches alias_idx
// Have we recently created a Phi for this alias index?
// Previous check may fail when the same wide memory Phi was split into Phis
// for different memory slices. Search all Phis for this region.
// Retry compilation without escape analysis.
// If this is the first failure, the sentinel string will "stick"
// to the Compile object, and the C2Compiler will see it and retry.
// Return a new version of Memory Phi "orig_phi" with the inputs having the
// specified alias index.
// found a phi for which we created a new split, push current one on worklist and begin
// verify that the new Phi has an input for each input of the original
// Check if all new phi's inputs have specified alias index.
// Otherwise use old phi.
// we have finished processing a Phi, see if there are any more to do
// The next methods are derived from methods in MemNode.
// TypeOopPtr::NOTNULL+any is an OOP with unknown offset - generally
// means an array I have not precisely typed yet. Do not do any
// alias stuff with it any time soon.
// Update input if it is progress over what we have now
// Move memory users to their memory slices.
continue;
// Nothing to do // Replace previous general reference to mem node. // Don't move related membars. continue;
// Nothing to do // Move to general memory slice. // Don't move related cardmark. // Memory nodes should have new memory input. "Following memory nodes should have new memory input or be on the same memory slice");
// Phi nodes should be split and moved already. assert(
false,
"should not be here");
// Search memory chain of "mem" to find a MemNode whose address // is the specified alias index. break;
// hit one of our sentinels break;
// Do not skip store to general memory slice. continue;
// don't search further for non-instance types // skip over a call which does not affect this memory slice break;
// hit one of our sentinels
// Stop if this is the initialization for the object instance
// which contains this memory slice, otherwise skip over it.
// Didn't find instance memory, search through general slice recursively.
// Can not bypass initialization of the instance
// Otherwise skip it (the call updated 'result' value).
assert(
idx !=
alias_idx,
"Object is not scalar replaceable if a LoadStore node access its field");
// Push all non-instance Phis on the orig_phis worklist to update inputs
// during Phase 4 if needed.
// Create a new Phi with the specified alias index type.
// the result is either MemNode, PhiNode, InitializeNode.
// Convert the types of unescaped object to instance types where possible,
// propagate the new type information through the graph, and update memory
// edges and MergeMem inputs to reflect the new type.
// We start with allocations (and calls which may be allocations) on alloc_worklist.
// The processing is done in 4 phases:
// Phase 1: Process possible allocations from alloc_worklist. Create instance
//          types for the CheckCastPP for allocations where possible.
//          Propagate the new types through users as follows:
//             casts and Phi: push users on alloc_worklist
//             AddP: cast Base and Address inputs to the instance type
//                   push any AddP users on alloc_worklist and push any memnode
//                   users onto memnode_worklist.
// Phase 2: Process MemNode's from memnode_worklist. Compute new address type and
//          search the Memory chain for a store with the appropriate
//          address type. If a Phi is found, create a new version with
//          the appropriate memory slices from each of the Phi inputs.
//          For stores, process the users as follows:
//             MemNode: push on memnode_worklist
//             MergeMem: push on mergemem_worklist
// Phase 3: Process MergeMem nodes from mergemem_worklist. Walk each memory slice
//          moving the first node encountered of each instance type to the
//          input corresponding to its alias index (the appropriate memory slice).
// Phase 4: Update the inputs of non-instance memory Phis and the Memory input of memnodes.
// In the following example, the CheckCastPP nodes are the cast of allocation
// results and the allocation of node 29 is unescaped and eligible to be an
// instance type.
//    20 AddP    _  19  19  10  Foo+12  alias_index=4
//    30 AddP    _  29  29  10  Foo+12  alias_index=4
//    40 StoreP  25   7  20  ...  alias_index=4
//    50 StoreP  35  40  30  ...  alias_index=4
//    60 StoreP  45  50  20  ...  alias_index=4
//    70 LoadP    _  60  30  ...  alias_index=4
//    80 Phi     75  50  60  Memory  alias_index=4
//    90 LoadP    _  80  30  ...  alias_index=4
//   100 LoadP    _  80  20  ...  alias_index=4
// Phase 1 creates an instance type for node 29 assigning it an instance id of 24
// and creating a new alias index for node 30. This gives:
//    20 AddP    _  19  19  10  Foo+12  alias_index=4
//    29 CheckCastPP  "Foo"  iid=24
//    30 AddP    _  29  29  10  Foo+12  alias_index=6  iid=24
//    40 StoreP  25   7  20  ...  alias_index=4
//    50 StoreP  35  40  30  ...  alias_index=6
//    60 StoreP  45  50  20  ...  alias_index=4
//    70 LoadP    _  60  30  ...  alias_index=6
//    80 Phi     75  50  60  Memory  alias_index=4
//    90 LoadP    _  80  30  ...  alias_index=6
//   100 LoadP    _  80  20  ...  alias_index=4
// In phase 2, new memory inputs are computed for the loads and stores,
// and a new version of the phi is created. In phase 4, the inputs to
// node 80 are updated and then the memory nodes are updated with the
// values computed in phase 2. This results in:
//    20 AddP    _  19  19  10  Foo+12  alias_index=4
//    29 CheckCastPP  "Foo"  iid=24
//    30 AddP    _  29  29  10  Foo+12  alias_index=6  iid=24
//    40 StoreP  25   7  20  ...  alias_index=4
//    50 StoreP  35   7  30  ...  alias_index=6
//    60 StoreP  45  40  20  ...  alias_index=4
//    70 LoadP    _  50  30  ...  alias_index=6
//    80 Phi     75  40  60  Memory  alias_index=4
//   120 Phi     75  50  50  Memory  alias_index=6
//    90 LoadP    _ 120  30  ...  alias_index=6
//   100 LoadP    _  80  20  ...  alias_index=4
// Phase 1: Process possible allocations from alloc_worklist.
// Create instance types for the CheckCastPP for allocations where possible.
// (Note: don't forget to change the order of the second AddP node on
// the alloc_worklist if the order of the worklist processing is changed,
// see the comment in find_second_addp().)
// copy escape information to call node
// We have an allocation or call which returns a Java object,
// see if it is unescaped.
// Find CheckCastPP for the allocate or for the return value of a call
if (n ==
NULL) {
// No uses except Initialize node // Set the scalar_replaceable flag for allocation // so it could be eliminated if it has no uses. // The inline code for Object.clone() casts the allocation result to // java.lang.Object and then to the actual type of the allocated // object. Detect this case and use the second cast. // Also detect j.l.reflect.Array.newInstance(jobject, jint) case when // the allocation result is cast to java.lang.Object and then // to the actual Array type. // Non-scalar replaceable if the allocation type is unknown statically // (reflection allocation), the object can't be restored during // deoptimization without precise type. // Set the scalar_replaceable flag for allocation // so it could be eliminated. // in order for an object to be scalar-replaceable, it must be: // - a direct allocation (not a call returning an object) // - eligible to be a unique type // - not determined to be ineligible by escape analysis continue;
// not a TypeOopPtr
// First, put on the worklist all Field edges from Connection Graph
// which is more accurate than putting immediate users from Ideal Graph.
"only AddP nodes are Field edges in CG");
if (
use->
outcnt() > 0) {
// Don't process dead nodes // An allocation may have an Initialize which has raw stores. Scan // the users of the raw allocation result and push AddP users assert(
false,
"escaped allocation");
continue;
// Assume the value was set outside this method. continue;
// already processed assert(
false,
"escaped allocation");
continue;
// Assume the value was set outside this method. continue;
// Skip dead path with different type assert(
false,
"EA: unexpected node");
// push allocation's users on appropriate worklist // Look for MergeMem nodes for calls which reference unique allocation // (through CheckCastPP nodes) even for debug info. assert(
false,
"EA: missing allocation reference path");
// New alias types were created in split_AddP(). // Phase 2: Process MemNode's from memnode_worklist. compute new address type and // compute new values for Memory inputs (the Memory inputs are not // actually updated until phase 4.) // we don't need to do anything, but the users must be pushed }
else if (n->
is_MemBar()) {
// Initialize, MemBar nodes // we don't need to do anything, but the users must be pushed // We delay the memory edge update since we need old one in // MergeMem code below when instances memory slices are separated. continue;
// don't push users // get the memory projection // push user on appropriate worklist assert(
false,
"EA: missing memory path");
// Phase 3: Process MergeMem nodes from mergemem_worklist.
// Walk each memory slice moving the first node encountered of each
// instance type to the input corresponding to its alias index.
// Note: we don't want to use MergeMemStream here because we only want to
// scan inputs which exist at the start, not ones we add during processing.
// Note 2: MergeMem may already contain instance memory slices added
// during find_inst_mem() call when memory nodes were processed above.
// First, update mergemem by moving memory nodes to corresponding slices
// if their type became more precise since this mergemem was created.
// Find any instance of the current type if we haven't already encountered
// a memory slice of the instance along the memory chain.
// Find the rest of instances values
// Didn't find instance memory, search through general slice recursively.
// Phase 4: Update the inputs of non-instance memory Phis and
// the Memory input of memnodes
// First update the inputs of any non-instance Phi's from
// which we split out an instance Phi. Note we don't have
// to recursively process Phi's encountered on the input memory
// chains as is done in split_memory_phi() since they will
// also be processed here.
// Update the memory inputs of MemNodes with the value we computed
// in Phase 2 and move stores memory users to corresponding memory slices.
// Disable memory split verification code until the fix for 6984348.
// Currently it produces false negative results since it does not cover all cases.
// Move memory users of a store first.
// Now update memory input
// Verify that memory was split correctly
// EA brings benefits only when the code has allocations and/or locks which
// are represented by ideal Macro nodes.
for(
int i=0; i <
cnt; i++ ) {
// Add ConP#NULL and ConN#NULL nodes before ConnectionGraph construction // to create space for them in ConnectionGraph::_nodes[]. // Perform escape analysis // There are non escaping objects. // 1. Populate Connection Graph (CG) with Ideal nodes. // Push all useful nodes onto CG list and set their type. // Only allocations and java static calls results are checked // for an escape status. See process_call_result() below. // Collect address nodes. Use them during stage 3 below // to build initial connection graph field edges. // Collect all MergeMem nodes to add memory slices for // scalar replaceable objects in split_unique_types(). // Compare pointers nodes return false;
// Nothing to do. // 2. First pass to create simple CG edges (doesn't require to walk CG). // 3. Pass to create initial fields edges (JavaObject -F-> AddP) // to reduce number of iterations during stage 4 below. // 4. Build Connection Graph which need // to walk the connection graph. if (n !=
NULL) {
// Call, AddP, LoadP, StoreP // After IGVN user nodes may have smaller _idx than // their inputs so they will be processed first in // previous loop. Because of that not all Graph // edges will be created. Walk over interesting // nodes again until no new edges are created. // Normally only 1-3 passes needed to build // Connection Graph depending on graph complexity. // Observed 8 passes in jvm2008 compiler.compiler. // Set limit to 20 to catch situation when something // did go wrong and recompile the method without EA. err_msg(
"infinite EA connection graph build with %d nodes and worklist size %d",
// Possible infinite build_connection_graph loop, // retry compilation without escape analysis. // 5. Find fields initializing values for not escaped allocations // 6. Remove deferred edges from the graph. // 7. Adjust escape state of nonescaping objects. // 8. Propagate escape states. // mark all nodes reachable from GlobalEscape nodes // mark all nodes reachable from ArgEscape nodes // push all NoEscape nodes on the worklist // mark all nodes reachable from NoEscape nodes // Push scalar replaceable allocations on alloc_worklist // for processing in split_unique_types(). Note, // following code may change scalar_replaceable value. // Propagate scalar_replaceable value. // Mark locks before changing ideal graph. for(
int i=0; i <
cnt; i++ ) {
// Add ConI(#CC_GT) and ConI(#CC_EQ). // Optimize objects compare. dump();
// Dump ConnectionGraph // Now use the escape information to create unique types for // scalar replaceable objects. tty->
print(
"=== No allocations eliminated for ");
tty->
print(
" since EliminateAllocations is off ===");
tty->
print(
" since there are no scalar replaceable candidates ===");
tty->
print(
" since AliasLevel < 3 ===");
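The escape-state propagation listed as step 8 earlier (mark all nodes reachable from GlobalEscape nodes, then from ArgEscape nodes, then from NoEscape nodes) is a plain worklist traversal. The sketch below illustrates that technique on a toy graph. `Graph`, `propagate`, and `propagate_all` are hypothetical names, and the adjacency-list representation is an assumption made for this example rather than the connection graph's real layout.

```cpp
#include <cassert>
#include <vector>

// Simplified, hypothetical escape lattice: larger value == escapes more.
enum class EscapeState { Unknown = 0, NoEscape = 1, ArgEscape = 2, GlobalEscape = 3 };

// Toy graph (assumption): node i references the nodes listed in edges[i].
struct Graph {
  std::vector<EscapeState> state;
  std::vector<std::vector<int>> edges;
};

// Raise every node reachable from a node in state 'es' to at least 'es'.
void propagate(Graph& g, EscapeState es) {
  std::vector<int> worklist;
  // push all nodes with the same escape state on the worklist
  for (int i = 0; i < static_cast<int>(g.state.size()); ++i) {
    if (g.state[i] == es) worklist.push_back(i);
  }
  // mark all reachable nodes
  while (!worklist.empty()) {
    int n = worklist.back();
    worklist.pop_back();
    for (int to : g.edges[n]) {
      if (g.state[to] < es) {
        g.state[to] = es;
        worklist.push_back(to);
      }
    }
  }
}

// Process the strongest state first so weaker passes never lower a node.
void propagate_all(Graph& g) {
  propagate(g, EscapeState::GlobalEscape);
  propagate(g, EscapeState::ArgEscape);
  propagate(g, EscapeState::NoEscape);
}
```

Running the passes from GlobalEscape down to NoEscape means each node ends with the maximum state of any node that can reach it, and the `<` check guarantees termination because a node's state only ever increases.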
// Find fields initializing values for allocations.
// Check if an oop field's initializing value is recorded and add
// a corresponding NULL field's value if it is not recorded.
// Connection Graph does not record a default initialization by NULL
// captured by Initialize node.
// Check only oop fields.
// Ignore non field load (for example, klass load)
// Ignore array length load
// Raw pointers are used for initializing stores so skip it
// since it should be recorded already
// Check for a store which follows allocation without branches.
// For example, a volatile field store is not collected
// by Initialize node. TODO: it would be nice to use idom() here.
// Search all references to the same field which use different
// AddP nodes, for example, in the next case:
//    Point p[] = new Point[1];
//    if ( x ) { p[0] = new Point(); p[0].x = x; }
//    if ( p[0] != null ) { y = p[0].x; } // has CastPP
// A field's initializing value was not recorded. Add NULL.
// Adjust escape state after Connection Graph is built.
// Search for objects which are not scalar replaceable
// and mark them to propagate the state to referenced objects.
// An object is not scalar replaceable if the field which may point
// to it has unknown offset (unknown element of an array of objects).
// Currently an object is not scalar replaceable if a LoadStore node
// accesses its field since the field value is unknown after it.
// An object is not scalar replaceable if the address points
// to unknown field (unknown element for arrays, offset is OffsetBot).
// Or the address may point to more than one object. This may produce
// a false positive result (set not scalar replaceable)
// since the flow-insensitive escape analysis can't separate
// the case when stores overwrite the field's value from the case
// when stores happened on different control branches.
// Note: it will disable scalar replacement in some cases: // Point p[] = new Point[1]; // p[0] = new Point(); // Will be not scalar replaced // but it will save us from incorrect optimizations in next cases: // Point p[] = new Point[1]; // if ( x ) p[0] = new Point(); // Will be not scalar replaced // Propagate escape states to referenced nodes. // push all nodes with the same escape state on the worklist // mark all reachable nodes // Has not escaping java objects // Optimize objects compare. // Clone returned Set since PointsTo() returns pointer // to the same structure ConnectionGraph.pt_ptset. // Check simple cases first. // Comparing the same not escaping object. // Comparing not escaping allocation. return _pcmp_neq;
// This includes nullness check. // Comparing not escaping allocation. return _pcmp_neq;
// This includes nullness check. return NULL;
// Sets are not disjoint // Check nullness of unknown object. // Disjointness by itself is not sufficient since // alias analysis is not complete for escaped objects. // Disjoint sets are definitely unrelated only when // at least one set has only not escaping objects. assert(
false,
"should be done already");
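The pointer-compare comments above state the key rule: disjoint points-to sets prove two pointers unequal only when at least one set contains nothing but non-escaping objects, because alias analysis is incomplete for escaped objects. A minimal sketch of just that rule follows. `PointsToSet`, `Cmp`, and `compare_pointers` are hypothetical names, and the sketch deliberately leaves out the "same non-escaping object" and nullness cases mentioned in the comments.

```cpp
#include <cassert>
#include <set>

// Hypothetical result type: either the compare folds to "not equal" or we cannot tell.
enum class Cmp { Unknown, NotEqual };

// Hypothetical points-to summary for one side of the comparison.
struct PointsToSet {
  std::set<int> objects;   // ids of the objects the pointer may reference
  bool all_non_escaping;   // true if every object in 'objects' is non-escaping
};

Cmp compare_pointers(const PointsToSet& a, const PointsToSet& b) {
  // With no points-to information on either side, nothing can be concluded.
  if (a.objects.empty() || b.objects.empty()) return Cmp::Unknown;

  // Sets are not disjoint: the pointers may be equal.
  for (int obj : a.objects) {
    if (b.objects.count(obj) != 0) return Cmp::Unknown;
  }

  // Disjointness alone is not sufficient; at least one side must
  // consist only of non-escaping objects.
  if (a.all_non_escaping || b.all_non_escaping) return Cmp::NotEqual;
  return Cmp::Unknown;
}
```

Returning `Unknown` for every uncertain case keeps the sketch conservative: the comparison is folded only when the analysis can prove the pointers differ.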
// Stub calls, objects do not escape but they are not scalar replaceable.
// Adjust escape state for outgoing arguments.
assert(
false,
"EA: unexpected CallLeaf");
// The inline_native_clone() case when the arraycopy stub is called
// after the allocation before Initialize and CheckCastPP nodes.
// Set AddP's base (Allocate) as not scalar replaceable since
// pointer to the base (with offset) is passed as argument.
// For a static call, we know exactly what method is being called.
// Use bytecode estimator to record the call's escape effects
// fall-through if not a Java method or no analyzer information
// The argument global escapes, mark everything it could point to
// The argument itself doesn't escape, but any fields might
// The argument global escapes, mark everything it could point to
// The argument itself doesn't escape, but any fields might
// Fall-through here if not a Java method or no analyzer information
// or some other type of call, assume the worst case: all arguments
// adjust escape state for outgoing arguments
// Not scalar replaceable if the length is not constant or too big.
// For a static call, we know exactly what method is being called.
// Use bytecode estimator to record whether the call's return value escapes
// Note: we use isa_ptr() instead of isa_oopptr() here because the
// _multianewarray functions return a TypeRawPtr.
break;
// doesn't return a pointer type
// not a Java method, assume global escape
// Returns a newly allocated unescaped object, simply
// update dependency information.
// Mark it as NoEscape so that objects referenced by
// its fields will be marked as NoEscape at least.
// determine whether any arguments are returned
// Returns unknown object.
// Some other type of call, assume the worst case that the
// returned value, if any, globally escapes.
// Note: we use isa_ptr() instead of isa_oopptr() here because the
// _multianewarray functions return a TypeRawPtr.
// Populate Connection Graph with Ideal nodes and create simple
// connection graph edges (do not need to check the node_type of inputs
// or to call PointsTo() to walk the connection graph).
return;
// No need to redefine node's state.
// Arguments to allocation and locking don't escape.
// Put Lock and Unlock nodes on IGVN worklist to process them during
// the first IGVN optimization when escape information is still available.
// Don't mark as processed since call's arguments have to be processed.
// Check if a call returns an object.
// Since the called method is statically unknown assume
// the worst case that the returned value globally escapes.
// Using isa_ptr() instead of isa_oopptr() for LoadP and Phi because
// ThreadLocal has RawPtr type.
{
// "Unsafe" memory access. // assume all pointer constants globally escape except for null // assume all narrow oop constants globally escape except for null // assume that all exception objects globally escape // We have to assume all input parameters globally escape // (Note: passing 'false' since _processed is already set). {
// Produces Null or notNull and is used in CmpP. // nothing to do if not an oop or narrow oop for (i =
1; i < n->
req() ; i++) {
continue;
// ignore top or inputs which go back to this node
// we are only interested in the oop result projection from a call
// The call may not be registered yet (since not all its inputs are registered)
// if this is the projection from backbranch edge of Phi.
// The call's result may need to be processed later if the call
// returns its argument and the argument is not processed yet.
// Treat Return value as LocalVar with GlobalEscape escape state.
// We are computing a raw address for a store captured
// by an Initialize compute an appropriate address type.
// char[] arrays passed to string intrinsics are not scalar replaceable.
// Don't set processed bit for AddP, LoadP, StoreP since
// they may need more than one pass to process.
// Also don't mark as processed Call nodes since their
// arguments may need more than one pass to process.
return;
// No need to redefine node's state. // Create a field edge to this node from everything base could point to. assert(
false,
"Op_LoadKlass");
// For everything "adr_base" could point to, create a deferred edge from // this node to each field with the same offset. // Add field edge if it is missing. assert(
false,
"Op_PartialSubtypeCheck");
for (
uint i =
1; i < n->
req() ; i++) {
continue;
// ignore top or inputs which go back to this node
// we are only interested in the oop result projection from a call
"all nodes should be registered");
// For everything "adr_base" could point to, create a deferred edge // to "val" from each field with the same offset. // Add field edge if it is missing. // char[] arrays passed to string intrinsic do not escape but // they are not scalar replaceable. Adjust escape state for them. // Start from in(2) edge since in(1) is memory edge. for (
uint i =
2; i < n->
req(); i++) {
// Mark as ArgEscape everything "adr" could point to. assert(
false,
"Op_ThreadLocal");
// This method should be called only for EA specific nodes. tty->
print(
"======== Connection graph for ");
// Print all locals which reference this allocation // Print all fields which reference this allocation