/*
 * Copyright (c) 1997, 2010, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 */

// Optimization - Graph Style

//-----------------------------------------------------------------------------

//=============================================================================
pop();                        // shrink list by one block
push(b);                      // grow list by one block

//=============================================================================
// Check for loop alignment
// Pre- and post-loops have low trip count so do not bother with
// NOPs for align loop head. The constants are hidden from tuning
// but only because my "divide by 4" heuristic surely gets nearly
// all possible gain (a "do not align at all" heuristic has a
// chance of getting a really tiny gain).
// Loops with low backedge frequency should not be aligned.
return unit_sz;               // Loop does not loop, more often than not!
return unit_sz;               // no particular alignment

//-----------------------------------------------------------------------------
// Compute the size of first 'inst_cnt' instructions in this block.
// Return the number of instructions left to compute if the block has
// fewer than 'inst_cnt' instructions. Stop, and return 0 if sum_size
// exceeds OptoLoopAlignment.
// Compute size of instructions which fit into fetch buffer only
// since all inst_cnt instructions will not fit even if we align them.

//-----------------------------------------------------------------------------
// Find and remove n from block list

//------------------------------is_Empty---------------------------------------
// Return empty status of a block. Empty blocks contain only the head, other
// ideal nodes, and an optional trailing goto.
// Root or start block is not considered empty
// Unreachable blocks are considered empty
// Ideal nodes are allowable in empty blocks: skip them. Only MachNodes
// turn directly into code, because only MachNodes have non-trivial
// No room for any interesting instructions?

//------------------------------has_uncommon_code------------------------------
// Return true if the block's code implies that it is likely to be
// executed infrequently. Check to see if the block ends in a Halt or
// a low probability call.
// This is true for slow-path stubs like new_{instance,array},
// slow_arraycopy, complete_monitor_locking, uncommon_trap.
// The magic number corresponds to the probability of an uncommon_trap,
// even though it is a count not a probability.

//------------------------------is_uncommon------------------------------------
// True if block is low enough frequency or guarded by a test which
// mostly does not go here.
// Initial blocks must never be moved, so are never uncommon.
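// Illustration only: a minimal sketch of the "size of the first 'inst_cnt'
// instructions" computation described above. It is not the code from this
// file. InstSizes, first_inst_size and align_limit are hypothetical names;
// align_limit stands in for OptoLoopAlignment, and a size of 0 models a node
// that emits no code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a block: the encoded size in bytes of each node.
using InstSizes = std::vector<std::uint32_t>;

// Adds the sizes of up to inst_cnt code-emitting instructions to sum_size.
// Returns the number of instructions still uncounted when the block runs out,
// or 0 as soon as the next instruction would push sum_size past align_limit.
std::uint32_t first_inst_size(const InstSizes& block, std::uint32_t& sum_size,
                              std::uint32_t inst_cnt, std::uint32_t align_limit) {
  for (std::uint32_t sz : block) {
    if (inst_cnt == 0) break;                   // counted everything requested
    if (sz == 0) continue;                      // emits no code, does not count
    if (sum_size + sz > align_limit) return 0;  // would not fit: stop here
    sum_size += sz;                             // keep only what fits
    --inst_cnt;
  }
  return inst_cnt;
}
```

// A caller could chain this across consecutive blocks, passing the returned
// count as the next inst_cnt until it reaches 0.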
// Check for way-low freq
// Look for code shape indicating uncommon_trap or slow path
// Check to see if this block follows its guard 1 time out of 10000
// See list of magnitude-4 unlikely probabilities in cfgnode.hpp which
// we intend to be "uncommon", such as slow-path TLE allocation,
// predicted call failure, and uncommon trap triggers.
// Use an epsilon value of 5% to allow for variability in frequency
// predictions and floating point calculations. The net effect is
// that guard_factor is set to 9500.
// Ignore low-frequency blocks.
// The next check is (guard->_freq < 1.e-5 * 9500.).
// The block is uncommon if all preds are uncommon or
// it is uncommon for all frequent preds.

//------------------------------dump-------------------------------------------
// Dump the original block's idx
// Print the incoming CFG edges and the outgoing CFG edges
// Dump any loop-specific bits, especially for CountedLoops.

//=============================================================================
//------------------------------PhaseCFG---------------------------------------
// I'll need a few machine-specific GotoNodes. Make an Ideal GotoNode,
// then Match it into a machine-specific Node. Then clone the machine
// Build the CFG in Reverse Post Order

//------------------------------build_cfg--------------------------------------
// Build a proper looking CFG. Make every block begin with either a StartNode
// or a RegionNode. Make every block end with either a Goto, If or Return.
// The RootNode both starts and ends its own block. Do this with a recursive
// backwards walk over the control edges.
// Allocate stack with enough space to avoid frequent realloc
uint sum = 0;
// Counter for blocks
// node and in's index from stack's top
// 'np' is _root (see above) or RegionNode, StartNode: we push on stack
// only nodes which point to the start of basic block (see below).
// idx > 0, except for the first node (_root) pushed on stack
// at the beginning when idx == 0.
// We will use the condition (idx == 0) later to end the build.
// Does the block end with a proper block-ending Node? One of Return,
// If or Goto? (This check should be done for visited nodes also).
if (x == NULL) {              // Does not end right...
// Skip any control-pinned middle'in stuff
proj = p;                     // Update pointer to last Control
p = p->in(0);                 // Move control forward
// Make the block begin with one of Region or StartNode.
r->init_req(1, p);            // Insert RegionNode in the way
// 'p' now points to the start of this basic block
// Put self in array of basic blocks
if( x != p ) {                // Only for root is x == p
// Now handle predecessors
++sum;                        // Count 1 for self block
for (int i = (cnt - 1); i > 0; i-- ) { // For all predecessors
// Check to see if p->in(i) is a "control-dependent" CFG edge -
// i.e., it splits at the source (via an IF or SWITCH) and merges
// at the destination (via a many-input Region).
// This breaks critical edges. The RegionNode to start the block
// will be added when <p,i> is pulled off the node stack
if ( cnt > 2 ) {              // Merging many things?
// Force a block on the control-dependent edge
nstack.push(p, i);            // 'p' is RegionNode or StartNode
}
else {                        // Post-processing visited nodes
// Check if it is the first node pushed on stack at the beginning.
if (idx == 0) break;          // end of the build
// Find predecessor basic block
// Insert into nodes array, if not already there
// Map basic block of projection
// Insert self as a child of my predecessor block
"too many control users, not a CFG?" );
// Return number of basic blocks for all children and self

//------------------------------insert_goto_at---------------------------------
// Inserts a goto & corresponding basic block between
// block[block_no] and its succ_no'th successor block
// get block with block_no
// get successor block succ_no
// Compute frequency of the new block. Do this before inserting
// new block in case succ_prob() needs to infer the probability from
// get ProjNode corresponding to the succ_no'th successor of the in block
// create region for basic block
// setup corresponding basic block
// add it to the basic block
// hook up successor block
// remap successor's predecessors if necessary
// remap predecessor's successor to new block
// Set the frequency of the new block
// add new basic block to basic block list

//------------------------------no_flip_branch---------------------------------
// Does this block end in a multiway branch that cannot have the default case
// flipped for another case?

//------------------------------convert_NeverBranch_to_Goto--------------------
// Check for NeverBranch at block end. This needs to become a GOTO to the
// true target. NeverBranch are treated as a conditional branch that always
// goes the same direction for most of the optimizer and are used to give a
// fake exit path to infinite loops. At this late stage they need to turn
// into Goto's so that when you enter the infinite loop you indeed hang.
// remap successor's predecessors if necessary
// Kill alternate exit path
// Scan through block, yanking dead path from

//------------------------------move_to_next-----------------------------------
// Helper function to move block bx to the slot following b_index. Return
// true if the move is successful, otherwise false
// Return false if bx is already scheduled.
// Find the current index of block bx on the block list
// If the previous block conditionally falls into bx, return false,
// because moving bx will create an extra jump.
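// Illustration only: a minimal sketch of the edge-splitting steps listed
// under insert_goto_at above, on a simplified CFG. The Block struct and its
// fields are hypothetical stand-ins, not the types used in this file. As a
// simplifying assumption the new block is appended to the end of the list;
// where it is placed in the real block list is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified block: edges are stored as indices into the list.
struct Block {
  double freq = 0.0;                // estimated execution frequency
  std::vector<std::size_t> succs;   // successor block indices
  std::vector<double> succ_prob;    // probability of taking each successor
  std::vector<std::size_t> preds;   // predecessor block indices
};

// Splits the edge from blocks[block_no] to its succ_no'th successor by
// putting a goto-only block on it. Returns the index of the new block.
std::size_t insert_goto_at(std::vector<Block>& blocks, std::size_t block_no,
                           std::size_t succ_no) {
  const std::size_t out_idx = blocks[block_no].succs[succ_no];
  // Compute the frequency of the new block before rewiring any edge.
  const double freq = blocks[block_no].freq * blocks[block_no].succ_prob[succ_no];

  Block goto_block;
  goto_block.freq = freq;
  goto_block.succs = {out_idx};     // hook up successor block
  goto_block.succ_prob = {1.0};     // a goto always reaches its target
  goto_block.preds = {block_no};
  const std::size_t new_idx = blocks.size();
  blocks.push_back(goto_block);     // add new block to the block list

  // Remap the successor's predecessor entry to the new block.
  for (std::size_t& p : blocks[out_idx].preds) {
    if (p == block_no) { p = new_idx; break; }
  }
  // Remap the predecessor's successor to the new block.
  blocks[block_no].succs[succ_no] = new_idx;
  return new_idx;
}
```

// Only indices are held across push_back, since growing the vector may
// invalidate references to its elements.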
// Reinsert bx just past block 'b'

//------------------------------move_to_end------------------------------------
// Move empty and uncommon blocks to the end.
// Remove the goto, but leave the block.
// Mark this block as a connector block, which will cause it to be
// ignored in certain functions such as non_connector_successor().
// Move the empty block to the end, and don't recheck.

//---------------------------set_loop_alignment--------------------------------
// Set loop alignment for every block

//-----------------------------remove_empty------------------------------------
// Make empty basic blocks to be "connector" blocks, Move uncommon blocks
// Move uncommon blocks to the end
// Check for NeverBranch at block end. This needs to become a GOTO to the
// true target. NeverBranch are treated as a conditional branch that
// always goes the same direction for most of the optimizer and are used
// to give a fake exit path to infinite loops. At this late stage they
// need to turn into Goto's so that when you enter the infinite loop you
// Look for uncommon blocks and move to end.
last--;                       // No longer check for being uncommon!
b = _blocks[i];               // Find the fall-thru block
i--;                          // backup block counter post-increment
// Move empty blocks to the end
}
// End of for all blocks

//-----------------------------fixup_flow--------------------------------------
// Fix up the final control flow for basic blocks.
// Fixup final control flow for the blocks. Remove jump-to-next
// block. If neither arm of an IF follows the conditional branch, we
// have to add a second jump after the conditional. We place the
// TRUE branch target in succs[0] for both GOTOs and IFs.
b->_pre_order = i;            // turn pre-order into block-index
// Connector blocks need no further processing.
"All connector blocks should sink to the end");
"Empty blocks should be connectors");
// Check for multi-way branches where I cannot negate the test to
// exchange the true and false targets.
// Find fall through case - if must fall into its target
// successor j2 is fall through case
// but it is not the next block => insert a goto
// Put taken branch in slot 0
// Flip targets in succs map
// We fall into next block; remove the Goto
}
else if( b->_num_succs == 2 ) { // Block ends in an If?
// Get opcode of 1st projection (matches _succs[0])
// Note: Since this basic block has 2 exits, the last 2 nodes must
// be projections (in any order), the 3rd last node must be
// the IfNode (we have excluded other 2-way exits such as
// Assert that proj0 and succs[0] match up. Similarly for proj1 and succs[1].
// Check for neither successor block following the current
// block ending in a conditional. If so, move one of the
// successors after the current one, provided that the
// successor was previously unscheduled, but moveable
// (i.e., all paths to it involve a branch).
// Choose the more common successor based on the probability
// of the conditional branch.
// _prob is the probability of taking the true path. Make
// p the probability of taking successor #1.
// Prefer successor #1 if p > 0.5
// Attempt the more common successor first
// Check for conditional branching the wrong way. Negate
// conditional, if needed, so it falls into the following block
// and branches to the not-following block.
// Check for the next block being in succs[0]. We are going to branch
// to succs[0], so we want the fall-thru case as the next block in
// Fall-thru case in succs[0], so flip targets in succs map
// Flip projection for each target
// The existing conditional branch need not change.
// Add an unconditional branch to the false target.
// Alas, it must appear in its own block and adding a
// block this late in the game is complicated. Sigh.
// Make sure we TRUE branch to the target
b->_nodes.pop();              // Remove IfFalse & IfTrue projections
// Multi-exit block, e.g. a switch statement
// But we don't need to do anything here
}
// End of for all blocks

//------------------------------dump-------------------------------------------
// Do not visit this block again
// Skip through this block
p = p->in(0);                 // Move control forward
if( _blocks.size() ) {        // Did we do basic-block layout?
}
else {                        // Else do it with a DFS
for (j = 0; j < cnt; j++) {
"CreateEx must be first instruction in block");
for (uint k = 0; k < n->req(); k++) {
"must have block; constants for debug info ok");
// Verify that instructions in the block are in correct order.
// Uses must follow their definition if they are in the same block.
// Mostly done to check that MachSpillCopy nodes are placed correctly
// when CreateEx node is moved in build_ifg_physical().
break;                        // Some kind of loop
assert( bp, "last instruction must be a block proj" );
assert( bp == b->_nodes[j], "wrong number of successors for this block" );
//=============================================================================
//------------------------------UnionFind--------------------------------------
// Force the Union-Find mapping to be at least this large
// Initialize to be the ID mapping.

//------------------------------Find_compress----------------------------------
// Straight out of Tarjan's union-find algorithm
while( next != cur ) {        // Scan chain of equivalences
cur = next;                   // until find a fixed-point
// Core of union-find algorithm: update chain of
// equivalences to be equal to the root.

//------------------------------Find_const-------------------------------------
// Like Find above, but no path compress, so bad asymptotic behavior
if( idx == 0 ) return idx;    // Ignore the zero idx
// Off the end? This can happen during debugging dumps
// when data structures have not finished being updated.
while( next != idx ) {        // Scan chain of equivalences
idx = next;                   // until find a fixed-point

//------------------------------Union------------------------------------------
// union 2 sets together.
for (int i = 0; i < count; i++) {
tty->print(" B%d --> B%d Freq: %f out:%3d%% in:%3d%% State: ",
//=============================================================================
//------------------------------edge_order-------------------------------------
// Comparison function for edges

//------------------------------trace_frequency_order--------------------------
// Comparison function for edges
// The trace of connector blocks goes at the end;
// we only expect one such trace
// Pull more frequently executed blocks to the beginning

//------------------------------find_edges-------------------------------------
// Find edges of interest, i.e., those which can fall through. Presumes that
// edges which don't fall through are of low frequency and can be generally
// ignored. Initialize the list of traces.
// Walk the blocks, creating edges and Traces
// All connector blocks should be at the end of the list
// If this block and the next one have a one-to-one successor
// predecessor relationship, simply append the next block
// Skip over single-entry connector blocks, we don't want to
// add them to the trace.
// We see a merge point, so stop search for the next block
// Create a CFGEdge for each outgoing
// edge that could be a fall-through.
// Group connector blocks into one trace

//------------------------------union_traces----------------------------------
// Union two traces together in uf, and null out the trace in the list
// If from is greater than to, swap values to meet
// Union the lower with the higher and remove the pointer

//------------------------------grow_traces-------------------------------------
// Append traces together via the most frequently executed edges
// Order the edges, and drive the growth of Traces via the most
// frequently executed edges.
// Don't grow traces along backedges?
// If the edge in question can join two traces at their ends,
// append one trace to the other.
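// Illustration only: a minimal sketch of how trace growth and the union-find
// mapping described above can fit together. The Edge struct, the vector-based
// traces and the UnionFind helper are hypothetical stand-ins, not the types
// used in this file. The sketch joins traces only tail-to-head, keeps the
// lower id as the representative, and leaves out backedge handling and the
// re-scan for newly eligible edges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical union-find over block ids, with path compression.
struct UnionFind {
  std::vector<std::size_t> parent;
  explicit UnionFind(std::size_t n) : parent(n) {
    std::iota(parent.begin(), parent.end(), std::size_t{0});  // identity mapping
  }
  std::size_t find(std::size_t x) {
    std::size_t root = x;
    while (parent[root] != root) root = parent[root];  // scan to the fixed point
    while (parent[x] != root) {                        // point the chain at the root
      std::size_t next = parent[x];
      parent[x] = root;
      x = next;
    }
    return root;
  }
};

struct Edge { std::size_t from, to; double freq; };

// Returns traces indexed by representative block id; merged-away slots are empty.
std::vector<std::vector<std::size_t>> grow_traces(std::size_t num_blocks,
                                                  std::vector<Edge> edges) {
  UnionFind uf(num_blocks);
  std::vector<std::vector<std::size_t>> traces(num_blocks);
  for (std::size_t b = 0; b < num_blocks; ++b) traces[b] = {b};

  // Visit the most frequently executed edges first.
  std::stable_sort(edges.begin(), edges.end(),
                   [](const Edge& a, const Edge& b) { return a.freq > b.freq; });

  for (const Edge& e : edges) {
    const std::size_t src = uf.find(e.from);
    const std::size_t dst = uf.find(e.to);
    if (src == dst) continue;                    // both ends already in one trace
    if (traces[src].back() != e.from) continue;  // must leave the tail of a trace
    if (traces[dst].front() != e.to) continue;   // must enter the head of a trace
    std::vector<std::size_t> merged = traces[src];
    merged.insert(merged.end(), traces[dst].begin(), traces[dst].end());
    const std::size_t lo = std::min(src, dst);
    const std::size_t hi = std::max(src, dst);
    uf.parent[hi] = lo;                          // lower id stays representative
    traces[lo] = std::move(merged);
    traces[hi].clear();                          // null out the absorbed trace
  }
  return traces;
}
```

// On a diamond 0 -> {1, 2} -> 3 where the path through block 1 is hot, this
// yields one trace 0, 1, 3 and leaves block 2 in a trace of its own.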
// Reset i to catch any newly eligible edge
// (Or we could remember the first "open" edge, and reset there)

//------------------------------merge_traces-----------------------------------
// Embed one trace into another, if the fork or join points are sufficiently
// Walk the edge list another time, looking at unprocessed edges.
// This may be a loop, but we can't do much about it.
// If the edge links the middle of two traces, we can't do anything.
// Mark the edge and continue.
// Don't grow traces along backedges?
// If both ends of the edge are available, why didn't we handle it earlier?
// Insert the "targ" trace in the "src" trace if the insertion point
// Better profitability check possible, but may not be worth it.
// Someday, see if this "fork" has an associated "join";
// then make a policy on merging this trace at the fork or join.
// For example, other things being equal, it may be better to place this
// trace at the join point if the "src" trace ends in a two-way, but
// the insertion point is one-way.
// Append traces, even without a fall-thru connection.
// But leave root entry at the beginning of the block list.

//----------------------------reorder_traces-----------------------------------
// Order the sequence of the traces in some desirable way, and fixup the
// jumps at the end of each block.
for (int i = 0; i < count; i++) {
// The entry block should be first on the new trace list.
// Sort the new trace list by frequency
// Patch up the successor blocks

//------------------------------PhaseBlockLayout-------------------------------
// Order basic blocks based on frequency
// Mapping block index --> block_trace
// Find edges and create traces.
// Grow traces at their ends via most frequent edges.
// Merge one trace into another, but only at fall-through points.
// This may make diamonds and other related shapes in a trace.
// Run merge again, allowing two traces to be catenated, even if
// one does not fall through into the other. This appends loosely
// related traces to be near each other.
// Re-order all the remaining traces by frequency

//------------------------------backedge---------------------------------------
// Edge e completes a loop in a trace. If the target block is head of the
// loop, rotate the loop block so that the loop ends in a conditional branch.
// Find the last block in the trace that has a conditional
// Rotate the loop by doing two-part linked-list surgery.
// Backbranch to the top of a trace
// Scroll forward through the trace from the targ_block. If we find
// a loop head before another loop top, use the loop head alignment.
// Backbranch into the middle of a trace

//------------------------------fixup_blocks-----------------------------------
// push blocks onto the CFG list
// ensure that blocks have the correct two-way branch sense
// Ensure that the sense of the branch is correct
// Fall-thru case in succs[0], should be in succs[1]
// Flip targets in _succs map
// Flip projections to match targets
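
// Illustration only: a minimal sketch of the branch-sense fix described under
// fixup_blocks above. TwoWayBlock and fix_branch_sense are hypothetical
// names; the 'negated' flag stands in for flipping the projections of the
// real branch node, which is not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical two-way block: succs[0] is the taken (TRUE) target and
// succs[1] the fall-through target, both given as block indices.
struct TwoWayBlock {
  std::size_t succs[2];
  bool negated;        // whether the branch test has been inverted
  double prob_taken;   // probability of branching to succs[0]
};

// If the block that follows in layout order sits in succs[0], swap the two
// targets and negate the test so that control falls through into it.
// Returns true when a flip was performed.
bool fix_branch_sense(TwoWayBlock& b, std::size_t next_block) {
  if (b.succs[0] != next_block) return false;  // sense is already correct
  std::swap(b.succs[0], b.succs[1]);           // flip targets in the succs map
  b.negated = !b.negated;                      // flip the test to match
  b.prob_taken = 1.0 - b.prob_taken;           // taken probability is complemented
  return true;
}
```

// Calling it a second time with the same next_block is a no-op, because the
// fall-through target then already sits in succs[1].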