loopnode.hpp revision 2230
1008N/A * Copyright (c) 1998, 2011, Oracle and/or its affiliates. All rights reserved. 1008N/A * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 1008N/A * This code is free software; you can redistribute it and/or modify it 1008N/A * under the terms of the GNU General Public License version 2 only, as 1008N/A * published by the Free Software Foundation. 1008N/A * This code is distributed in the hope that it will be useful, but WITHOUT 1008N/A * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 1008N/A * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License 1008N/A * version 2 for more details (a copy is included in the LICENSE file that 1008N/A * You should have received a copy of the GNU General Public License version 1008N/A * 2 along with this work; if not, write to the Free Software Foundation, 1008N/A * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 1008N/A * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA 1008N/A// I D E A L I Z E D L O O P S 1008N/A// Idealized loops are the set of loops I perform more interesting 1008N/A// transformations on, beyond simple hoisting. 1008N/A//------------------------------LoopNode--------------------------------------- 1008N/A// Simple loop header. Fall in path on left, loop-back path on right. 1008N/A // Size is bigger to hold the flags. However, the flags do not change 1008N/A // the semantics so it does not appear in the hash & cmp functions. 1008N/A // Names for flag bitfields 1008N/A//------------------------------Counted Loops---------------------------------- 1008N/A// Counted loops are all trip-counted loops, with exactly 1 trip-counter exit 1008N/A// path (and maybe some other exit paths). The trip-counter exit is always 1008N/A// last in the loop. The trip-counter have to stride by a constant; 1008N/A// the exit value is also loop invariant. 1008N/A// CountedLoopNodes and CountedLoopEndNodes come in matched pairs. The 1008N/A// CountedLoopNode has the incoming loop control and the loop-back-control 1008N/A// which is always the IfTrue before the matching CountedLoopEndNode. The 1008N/A// CountedLoopEndNode has an incoming control (possibly not the 1008N/A// CountedLoopNode if there is control flow in the loop), the post-increment 1008N/A// trip-counter value, and the limit. The trip-counter value is always of 1008N/A// the form (Op old-trip-counter stride). The old-trip-counter is produced 1008N/A// by a Phi connected to the CountedLoopNode. The stride is constant. 1008N/A// The Op is any commutable opcode, including Add, Mul, Xor. The 1008N/A// CountedLoopEndNode also takes in the loop-invariant limit value. 1008N/A// From a CountedLoopNode I can reach the matching CountedLoopEndNode via the 1008N/A// loop-back control. From CountedLoopEndNodes I can reach CountedLoopNodes 1008N/A// via the old-trip-counter from the Op node. 1008N/A//------------------------------CountedLoopNode-------------------------------- 1008N/A// CountedLoopNodes head simple counted loops. CountedLoopNodes have as 1008N/A// inputs the incoming loop-start control and the loop-back control, so they 1008N/A// act like RegionNodes. They also take in the initial trip counter, the 1008N/A// loop-invariant stride and the loop-invariant limit value. CountedLoopNodes 1008N/A// produce a loop-body control and the trip counter value. Since 1008N/A// CountedLoopNodes behave like RegionNodes I still have a standard CFG model. 1008N/A // Size is bigger to hold _main_idx. However, _main_idx does not change 1008N/A // the semantics so it does not appear in the hash & cmp functions. 1008N/A // For Pre- and Post-loops during debugging ONLY, this holds the index of 1008N/A // the Main CountedLoop. Used to assert that we understand the graph shape. 1008N/A // Known trip count calculated by policy_maximally_unroll 1008N/A // Expected trip count from profile data 1008N/A // Log2 of original loop bodies in unrolled loop 1008N/A // Node count prior to last unrolling - used to decide if 1008N/A // unroll,optimize,unroll,optimize,... is making progress 1008N/A // Initialize _trip_count to the largest possible value. 1008N/A // Will be reset (lower) if the loop's trip count is known. 1008N/A // Match increment with optional truncation 1008N/A // A 'main' loop has a pre-loop and a post-loop. The 'main' loop 1008N/A // can run short a few iterations and may start a few iterations in. 1008N/A // It will be RCE'd and unrolled and aligned. 1008N/A // A following 'post' loop will run any remaining iterations. Used 1008N/A // during Range Check Elimination, the 'post' loop will do any final 1008N/A // iterations with full checks. Also used by Loop Unrolling, where 1008N/A // the 'post' loop will do any epilog iterations needed. Basically, 1008N/A // a 'post' loop can not profitably be further unrolled or RCE'd. 1008N/A // A preceding 'pre' loop will run at least 1 iteration (to do peeling), 1008N/A // it may do under-flow checks for RCE and may do alignment iterations 1008N/A // so the following main loop 'knows' that it is striding down cache 1008N/A // A 'main' loop that is ONLY unrolled or peeled, never RCE'd or 1008N/A // Aligned, may be missing it's pre-loop. 1008N/A//------------------------------CountedLoopEndNode----------------------------- 1008N/A// CountedLoopEndNodes end simple trip counted loops. They act much like 1008N/A// -----------------------------IdealLoopTree---------------------------------- 1008N/A // The head-tail backedge defines the loop. 1008N/A // If tail is NULL then this loop has multiple backedges as part of the 1008N/A // same loop. During cleanup I'll peel off the multiple backedges; merge 1008N/A // them at the loop bottom and flow 1 real backedge into the loop. 1008N/A // Is 'l' a member of 'this'? 1008N/A // Set loop nesting depth. Accumulate has_call bits. 1008N/A // Split out multiple fall-in edges from the loop header. Move them to a 1008N/A // private RegionNode before the loop. This becomes the loop landing pad. 1008N/A // Split out the outermost loop from this shared header. 1008N/A // Merge all the backedges from the shared header into a private Region. 1008N/A // Feed that region as the one backedge to this loop. 1008N/A // Split shared headers and insert loop landing pads. 1008N/A // Insert a LoopNode to replace the RegionNode. 1008N/A // Returns TRUE if loop tree is structurally changed. 1008N/A // Perform optimization to use the loop predicates for null checks and range checks. 1008N/A // Applies to any loop level (not just the innermost one) 1008N/A // Perform iteration-splitting on inner loops. Split iterations to 1008N/A // avoid range checks or one-shot null checks. Returns false if the 1008N/A // current round of loop opts should stop. 1008N/A // Driver for various flavors of iteration splitting. Returns false 1008N/A // if the current round of loop opts should stop. 1008N/A // Given dominators, try to find loops with calls that must always be 1008N/A // executed (call dominates loop tail). These loops do not need non-call 1008N/A // Allpaths backwards scan from loop tail, terminating each path at first safepoint 1008N/A // Convert to counted loops where possible 1008N/A // Check for Node being a loop-breaking test 1008N/A // Returns true if ctrl is executed on every complete iteration 1008N/A // Remove simplistic dead code from loop body 1008N/A // Look for loop-exit tests with my 50/50 guesses from the Parsing stage. 1008N/A // Replace with a 1-in-10 exit guess. 1008N/A // Return TRUE or FALSE if the loop should never be RCE'd or aligned. 1008N/A // Useful for unrolling loops with NO array accesses. 1008N/A // Return TRUE or FALSE if the loop should be unswitched -- clone 1008N/A // loop with an invariant test 1008N/A // Micro-benchmark spamming. Remove empty loops. 1008N/A // Return TRUE or FALSE if the loop should be peeled or not. Peel if we can 1008N/A // make some loop-invariant test (usually a null-check) happen before the 1389N/A // Return TRUE or FALSE if the loop should be maximally unrolled. Stash any 1389N/A // known trip count in the counted loop node. 1008N/A // Return TRUE or FALSE if the loop should be unrolled or not. Unroll if 1008N/A // the loop is a CountedLoop and the body is small enough. 1008N/A // Return TRUE or FALSE if the loop should be range-check-eliminated. 1008N/A // Gather a list of IF tests that are dominated by iteration splitting; 1008N/A // also gather the end of the first split and the start of the 2nd split. 1008N/A // Return TRUE or FALSE if the loop should be cache-line aligned. 1008N/A // Gather the expression that does the alignment. Note that only 1008N/A // one array base can be aligned in a loop (unless the VM guarantees 1008N/A // mutual alignment). Note that if we vectorize short memory ops 1008N/A // into longer memory ops, we may want to increase alignment. 1008N/A // Return TRUE if "iff" is a range check. 1008N/A // Compute loop trip count from profile data 1008N/A // Reassociate invariant expressions. 1008N/A // Reassociate invariant add and subtract expressions. 1008N/A // Return nonzero index of invariant operand if invariant and variant 1008N/A // are combined with an Add or Sub. Helper for reassociate_invariants. 1008N/A // Return true if n is invariant 1008N/A // Put loop body on igvn work list 1008N/A void dump()
const;
// Dump this loop recursively 1008N/A// -----------------------------PhaseIdealLoop--------------------------------- 1008N/A// Computes the mapping from Nodes to IdealLoopTrees. Organizes IdealLoopTrees into a 1008N/A// loop tree. Drives the loop-based transformations on the ideal graph. 1008N/A // Pre-computed def-use info 1008N/A // Array of pre-order numbers, plus post-visited bit. 1008N/A // ZERO for not pre-visited. EVEN for pre-visited but not post-visited. 1008N/A // ODD for post-visited. Other bits are the pre-order number. 1008N/A // Allocate _preorders[] array 1008N/A // Allocate _preorders[] array 1008N/A // Check to grow _preorders[] array for the case when build_loop_tree_impl() 1008N/A // Check for pre-visited. Zero for NOT visited; non-zero for visited. 1008N/A // Pre-order numbers are written to the Nodes array as low-bit-set values. 1008N/A // Return pre-order number. 1008N/A // Check for being post-visited. 1008N/A // Should be previsited already (checked with assert(is_visited(n))). 1008N/A // Set/get control node out. Set lower bit to distinguish from IdealLoopTree 1008N/A // Returns true if "n" is a data node, false if it's a control node. 1008N/A // clear out dead code after build_loop_late 1008N/A // Support for faster execution of get_late_ctrl()/dom_lca() 1008N/A // when a node has many uses and dominator depth is deep. 1008N/A // Helper for debugging bad dominance relationships 1008N/A // Inline wrapper for frequent cases: 1008N/A // 2) a use is the same as the current LCA passed as 'n1' 1008N/A // Helper function for directing control inputs away from CFG split 1008N/A // check if transform created new nodes that need _ctrl recorded 1008N/A // Set control and update loop membership 1418N/A // Control nodes can be replaced or subsumed. During this pass they 1008N/A // get their replacement Node in slot 1. Instead of updating the block 1008N/A // location of all Nodes in the subsumed block, we lazily do it. As we 1008N/A // pull such a subsumed block out of the array, we write back the final 1418N/A // true if CFG node d dominates CFG node n 1008N/A // return get_ctrl for a data node and self(n) for a CFG node 1008N/A // Check for loop being set 1008N/A // "n" must be a control node. Returns true if "n" is known to be in a loop. 1008N/A // Lazy-dazy update of 'get_ctrl' and 'idom_at' mechanisms. Replace 1008N/A // the 'old_node' with 'new_node'. Kill old-node. Add a reference 1008N/A // from old_node to new_node to support the lazy update. Reference 1008N/A // replaces loop reference, since that is not needed for dead node. 1008N/A //old_node->set_req( 1, new_node /*NO DU INFO*/ ); 1008N/A // Nodes always have DU info now, so re-use the side array slot 1008N/A // for this node to provide the forwarding pointer. 1418N/A // Place 'n' in some loop nest, where 'n' is a CFG node 1418N/A // Insert loop into the existing loop tree. 'innermost' is a leaf of the 1008N/A // loop tree, not the root. 1418N/A // Place Data nodes in some loop nest 1418N/A // Array of immediate dominance info for each CFG node indexed by node idx 1008N/A // Locally compute IDOM using dom_lca call 1008N/A // Is safept not required by an outer loop? 1008N/A // Replace parallel induction variable (parallel to trip counter) 1008N/A // Perform verification that the graph is valid. 1008N/A // build the loop tree and perform any requested optimizations 1008N/A // Dominators for the sea of nodes 1008N/A // Compute the Ideal Node to Loop mapping 1008N/A // Verify that verify_me made the same decisions as a fresh run. 1008N/A // Build and verify the loop tree without modifying the graph. This 1008N/A // is useful to verify that all inputs properly dominate their uses. 1008N/A // True if the method has at least 1 irreducible loop 1008N/A // Return a post-walked LoopNode 1008N/A // Dead nodes have no loop, so return the top level loop instead 1008N/A // Is 'n' a (nested) member of 'loop'? 1008N/A // This is the basic building block of the loop optimizations. It clones an 1008N/A // entire loop body. It makes an old_new loop body mapping; with this 1008N/A // mapping you can find the new-loop equivalent to an old-loop node. All 1008N/A // new-loop nodes are exactly equal to their old-loop counterparts, all 1008N/A // edges are the same. All exits from the old-loop now have a RegionNode 1008N/A // that merges the equivalent new-loop path. This is true even for the 1008N/A // normal "loop-exit" condition. All uses of loop-invariant old-loop values 1008N/A // now come from (one or more) Phis that merge their new-loop equivalents. 1008N/A // Parameter side_by_side_idom: 1008N/A // When side_by_size_idom is NULL, the dominator tree is constructed for 1008N/A // the clone loop to dominate the original. Used in construction of 1008N/A // pre-main-post loop sequence. 1008N/A // When nonnull, the clone and original are side-by-side, both are 1008N/A // dominated by the passed in side_by_side_idom node. Used in 1008N/A // construction of unswitched loops. 1008N/A // If we got the effect of peeling, either by actually peeling or by 1008N/A // making a pre-loop which must execute at least once, we can remove 1008N/A // all loop-invariant dominated tests in the main body. 1008N/A // Generate code to do a loop peel for the given loop (and body). 1008N/A // old_new is a temp array. 1008N/A // Add pre and post loops around the given loop. These loops are used 1008N/A // during RCE, unrolling and aligning loops. 1008N/A // If Node n lives in the back_ctrl block, we clone a private version of n 1008N/A // in preheader_ctrl block and return that, otherwise return n. 1008N/A // Take steps to maximally unroll the loop. Peel any odd iterations, then 1008N/A // unroll to do double iterations. The next round of major loop transforms 1008N/A // will repeat till the doubled loop body does all remaining iterations in 1 1008N/A // Unroll the loop body one step - make each trip do 2 iterations. 1008N/A // Return true if exp is a constant times an induction var 1008N/A // Return true if exp is a scaled induction var plus (or minus) constant 1008N/A // Return true if proj is for "proj->[region->..]call_uct" 1008N/A // Return true if proj is for "proj->[region->..]call_uct" 1008N/A // Return true for "if(test)-> proj -> ... 1008N/A // other_proj->[region->..]call_uct" 1008N/A // Create a new if above the uncommon_trap_if_pattern for the predicate to be promoted 1008N/A // Find a good location to insert a predicate 1008N/A // Construct a range check for a predicate if 1008N/A // Implementation of the loop predication to promote checks outside the loop 1008N/A // Helper function to collect predicate for eliminating the useless ones 1008N/A // Eliminate range-checks and other trip-counter vs loop-invariant tests. 1008N/A // Create a slow version of the loop by cloning the loop 1008N/A // and inserting an if to select fast-slow versions. 1008N/A // Clone loop with an invariant test (that does not exit) and 1008N/A // insert a clone of the test that selects which version to 1008N/A // Find candidate "if" for unswitching 1008N/A // Range Check Elimination uses this function! 1008N/A // Constrain the main loop iterations so the affine function: 1008N/A // scale_con * I + offset < limit 1008N/A // always holds true. That is, either increase the number of iterations in 1008N/A // the pre-loop or the post-loop until the condition holds true in the main 1008N/A // loop. Scale_con, offset and limit are all loop invariant. 1008N/A // Partially peel loop up through last_peel node. 1008N/A // Create a scheduled list of nodes control dependent on ctrl set. 1008N/A // Has a use in the vector set 1008N/A // Has use internal to the vector set (ie. not in a phi at the loop head) 1008N/A // clone "n" for uses that are outside of loop 1008N/A // clone "n" for special uses that are in the not_peeled region 1008N/A // Insert phi(lp_entry_val, back_edge_val) at use->in(idx) for loop lp if phi does not already exist 1008N/A // Validate the loop partition sets: peel and not_peel 1008N/A // Ensure that uses outside of loop are of the right form 1008N/A // Returns nonzero constant stride if-node is a possible iv test (otherwise returns zero.) 1008N/A // Return the (unique) control output node that's in the loop (if it exists.) 1008N/A // Insert a signed compare loop exit cloned from an unsigned compare. 1008N/A // Utility to register node "n" with PhaseIdealLoop 1008N/A // Utility to create an if-projection 1008N/A // Force the iff control output to be the live_proj 1008N/A // Insert a region before an if projection 1008N/A // Insert a new if before an if projection 1008N/A // Passed in a Phi merging (recursively) some nearly equivalent Bool/Cmps. 1008N/A // "Nearly" because all Nodes have been cloned from the original in the loop, 1008N/A // but the fall-in edges to the Cmp are different. Clone bool/Cmp pairs 1008N/A // through the Phi recursively, and return a Bool. 1008N/A // Rework addressing expressions to get the most loop-invariant stuff 1008N/A // moved out. We'd like to do all associative operators, but it's especially 1008N/A // important (common) to do address expressions. 1008N/A // Reorganize offset computations to lower register pressure. 1008N/A // Mostly prevent loop-fallout uses of the pre-incremented trip counter 1008N/A // (which are then alive with the post-incremented trip counter 1008N/A // forcing an extra register move) 1008N/A // Check for aggressive application of 'split-if' optimization, 1008N/A // using basic block level info. 1008N/A // Mark an IfNode as being dominated by a prior test, 1008N/A // without actually altering the CFG (and hence IDOM info). 1008N/A // Split Node 'n' through merge point 1008N/A // Split Node 'n' through merge point if there is enough win. 1008N/A // Found an If getting its condition-code input from a Phi in the 1008N/A // same block. Split thru the Region. 1008N/A // Return a type based on condition control flow 1008N/A // Helpers for filtered type 1008N/A // Dead nodes have no loop, so return the top level loop instead 1008N/A// Handle lazy update of _tail field 1008N/A //while( !n->in(0) ) // Skip dead CFG nodes 1008N/A// Iterate over the loop tree using a preorder, left-to-right traversal. 1008N/A// Example that visits all counted loops from within PhaseIdealLoop 1008N/A// for (LoopTreeIterator iter(_ltree_root); !iter.done(); iter.next()) { 1008N/A// if (!lpt->is_counted()) continue; 1008N/A#
endif // SHARE_VM_OPTO_LOOPNODE_HPP