chaitin.cpp revision 4022
3239N/A * Copyright (c) 2000, 2012, Oracle and/or its affiliates. All rights reserved. 0N/A * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 0N/A * This code is free software; you can redistribute it and/or modify it 0N/A * under the terms of the GNU General Public License version 2 only, as 0N/A * published by the Free Software Foundation. 0N/A * This code is distributed in the hope that it will be useful, but WITHOUT 0N/A * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or 0N/A * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License 0N/A * version 2 for more details (a copy is included in the LICENSE file that 0N/A * accompanied this code). 0N/A * You should have received a copy of the GNU General Public License version 0N/A * 2 along with this work; if not, write to the Free Software Foundation, 0N/A * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA. 1472N/A * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA 0N/A//============================================================================= 0N/A//------------------------------score------------------------------------------ 0N/A// Compute score from cost and area. Low score is best to spill. 0N/A // Scale _area by RegisterCostAreaRatio/64K then subtract from cost. 0N/A // Bigger area lowers score, encourages spilling this live range. 0N/A // Bigger cost raise score, prevents spilling this live range. 0N/A // (Note: 1/65536 is the magic constant below; I dont trust the C optimizer 0N/A // to turn a divide by a constant into a multiply by the reciprical). 0N/A // Account for area. Basically, LRGs covering large areas are better 0N/A // to spill because more other LRGs get freed up. 0N/A if(
_area ==
0.0 )
// No area? Then no progress to spill 0N/A return score +
1e30;
// to make progress again. 0N/A return score +
1e17;
// Probably no progress to spill 0N/A return score +
1e10;
// Likely no progress to spill 0N/A//------------------------------LRG_List--------------------------------------- 0N/A//------------------------------Chaitin---------------------------------------- 0N/A // Build a list of basic blocks, sorted by frequency 0N/A // Experiment with sorting strategies to speed compilation 0N/A // Bump by three orders of magnitude each time 0N/A // Sort blocks into buckets 0N/A // Assign block to end of list for appropriate bucket 0N/A break;
// kick out of inner loop 0N/A // Dump buckets into final block array 0N/A // Above the OLD FP (and in registers) are the incoming arguments. Stack 0N/A // slots in this area are called "arg_slots". Above the NEW FP (and in 0N/A // registers) is the outgoing argument area; above that is the spill/temp 0N/A // area. These are all "frame_slots". Arg_slots start at the zero 0N/A // stack_slots and count up to the known arg_size. Frame_slots start at 0N/A // the stack_slot #arg_size and go up. After allocation I map stack 0N/A // slots to actual offsets. Stack-slots in the arg_slot area are biased 0N/A // by the frame_size; stack-slots in the frame_slot area are biased by 0. 0N/A // Need live-ness for the IFG; need the IFG for coalescing. If the 0N/A // liveness is JUST for coalescing, then I can get some mileage by renaming 0N/A // all copy-related live ranges low and then using the max copy-related 0N/A // live range as a cut-off for LIVE and the IFG. In other words, I can 0N/A // build a subset of LIVE and IFG just for copies. 0N/A // Need IFG for coalescing and coloring 0N/A // Come out of SSA world to the Named world. Assign (virtual) registers to 0N/A // Nodes. Use the same register for all inputs and the output of PhiNodes 0N/A // - effectively ending SSA form. This requires either coalescing live 0N/A // ranges or inserting copies. For the moment, we insert "virtual copies" 0N/A // - we pretend there is a copy prior to each Phi in predecessor blocks. 0N/A // We will attempt to coalesce such "virtual copies" before we manifest 566N/A // Veify the graph before RA. 0N/A // Base pointers are currently "used" by instructions which define new 0N/A // derived pointers. This makes base pointers live up to the where the 0N/A // derived pointer is made, but not beyond. Really, they need to be live 0N/A // across any GC point where the derived value is live. So this code looks 0N/A // at all the GC points, and "stretches" the live range of any base pointer 0N/A // Since some live range stretched, I need to recompute live 0N/A // Create the interference graph using virtual copies 0N/A // Aggressive (but pessimistic) copy coalescing. 0N/A // This pass works on virtual copies. Any virtual copies which are not 0N/A // coalesced get manifested as actual copies 0N/A // The IFG is/was triangular. I am 'squaring it up' so Union can run 0N/A // faster. Union requires a 'for all' operation which is slow on the 0N/A // triangular adjacency matrix (quick reminder: the IFG is 'sparse' - 0N/A // meaning I can visit all the Nodes neighbors less than a Node in time 0N/A // O(# of neighbors), but I have to visit all the Nodes greater than a 0N/A // given Node and search them for an instance, i.e., time O(#MaxLRG)). 0N/A // Insert un-coalesced copies. Visit all Phis. Where inputs to a Phi do 0N/A // not match the Phi itself, insert a copy. 0N/A // After aggressive coalesce, attempt a first cut at coloring. 0N/A // To color, we need the IFG and for that we need LIVE. 0N/A // Build physical interference graph 0N/A // If we have a guaranteed spill, might as well spill now 0N/A // Bail out if unique gets too large (ie - unique > MaxNodeLimit) 0N/A // Bail out if unique gets too large (ie - unique > MaxNodeLimit - 2*NodeLimitFudgeFactor) 0N/A // or we failed to split 0N/A compact();
// Compact LRGs; return new lower max lrg 0N/A // Only do conservative coalescing if requested 0N/A // Conservative (and pessimistic) copy coalescing of those spills 0N/A // If max live ranges greater than cutoff, don't color the stack. 0N/A // This cutoff can be larger than below since it is only done once. 0N/A // Prepare for Simplify & Select 0N/A // Simplify the InterFerence Graph by removing LRGs of low degree. 0N/A // LRGs of low degree are trivially colorable. 0N/A // Select colors by re-inserting LRGs back into the IFG in reverse order. 0N/A // Return whether or not something spills. 0N/A // If we spill, split and recycle the entire thing 0N/A // Bail out if unique gets too large (ie - unique > MaxNodeLimit - 2*NodeLimitFudgeFactor) 0N/A compact();
// Compact LRGs; return new lower max lrg 0N/A // Nuke the live-ness and interference graph and LiveRanGe info 0N/A // Create LiveRanGe array. 0N/A // Intersect register masks for all USEs and DEFs 0N/A // Only do conservative coalescing if requested 0N/A // Conservative (and pessimistic) copy coalescing 0N/A // Check for few live ranges determines how aggressive coalesce is. 0N/A // Simplify the InterFerence Graph by removing LRGs of low degree. 0N/A // LRGs of low degree are trivially colorable. 0N/A // Select colors by re-inserting LRGs back into the IFG in reverse order. 0N/A // Return whether or not something spills. 0N/A // Count number of Simplify-Select trips per coloring success. 0N/A // Peephole remove copies 566N/A // Veify the graph after RA. 0N/A // max_reg is past the largest *register* used. 0N/A // Convert that to a frame_slot number. 0N/A // This frame must preserve the required fp alignment 0N/A // Convert CISC spills 0N/A // Log regalloc results 0N/A // Move important info out of the live_arena to longer lasting storage. 3970N/A }
else {
// Must be a register-set 0N/A // Live ranges record the highest register in their mask. 0N/A // We want the low register for the AD file writer's convenience. 3970N/A // We have to use pair [lo,lo+1] even for wide vectors because 3970N/A // the rest of code generation works only with pairs. It is safe 3970N/A // since for registers encoding only 'lo' is used. 3970N/A // Second reg from pair is used in ScheduleAndBundle on SPARC where 3970N/A // vector max size is 8 which corresponds to registers pair. 3970N/A // It is also used in BuildOopMaps but oop operations are not 0N/A }
else {
// Misaligned; extract 2 bits 0N/A//------------------------------de_ssa----------------------------------------- 0N/A // Set initial Names for all Nodes. Most Nodes get the virtual register 0N/A // number. A few get the ZERO live range number. These do not 0N/A // get allocated, but instead rely on correct scheduling to ensure that 0N/A // only one instance is simultaneously live at a time. 0N/A // Handle all the normal Nodes in the block 0N/A // Pre-color to the zero live range, or pick virtual register 0N/A // Reset the Union-Find mapping to be identity 0N/A//------------------------------gather_lrg_masks------------------------------- 0N/A// Gather LiveRanGe information, including register masks. Modification of 0N/A// cisc spillable in_RegMasks should not be done before AggressiveCoalesce. 0N/A // Nail down the frame pointer live range 0N/A // For all instructions 0N/A // Get virtual register number, same as LiveRanGe index 0N/A if(
vreg ) {
// No vreg means un-allocable (e.g. memory) 0N/A // Collect has-copy bit 0N/A // Check for float-vs-int live range (used in register-pressure 0N/A // Check for twice prior spilling. Once prior spilling might have 0N/A // spilled 'soft', 2nd prior spill should have spilled 'hard' and 0N/A // further spilling is unlikely to make progress. 0N/A // collect defs for MultiDef printing 0N/A // Check for a single def LRG; these can spill nicely 0N/A // via rematerialization. Flag as NULL for no def found 0N/A // yet, or 'n' for single def or -1 for many defs. 0N/A // Limit result register mask to acceptable registers 0N/A "oops must be in Op_RegP's" );
3845N/A // Check for vector live range (only if vector register is used). 3845N/A // On SPARC vector uses RegD which could be misaligned so it is not 3845N/A // processes as vector in RA. 3845N/A "vector must be in vector registers");
3845N/A // Check for bound register masks 3845N/A // Check for maximum frequency value 0N/A // Check for multi-kill projection 0N/A // Fat projections have size equal to number of registers killed 0N/A // Register pressure is tracked relative to the maximum values 0N/A // suggested for that platform, INTPRESSURE and FLOATPRESSURE, 0N/A // and relative to other types which compete for the same regs. 0N/A // The following table contains suggested values based on the 0N/A // architectures as defined in each .ad file. 0N/A // INTPRESSURE and FLOATPRESSURE may be tuned differently for 0N/A // compile-speed or performance. 0N/A // SPARC and SPARCV9 reg_pressures are at 2 instead of 1 0N/A // since .ad registers are defined as high and low halves. 0N/A // These reg_pressure values remain compatible with the code 0N/A // in is_high_pressure() which relates get_invalid_mask_size(), 0N/A // Block::_reg_pressure and INTPRESSURE, FLOATPRESSURE. 0N/A // SPARC -d32 has 24 registers available for integral values, 0N/A // but only 10 of these are safe for 64-bit longs. 0N/A // Using set_reg_pressure(2) for both int and long means 0N/A // the allocator will believe it can fit 26 longs into 0N/A // registers. Using 2 for longs and 1 for ints means the 0N/A // allocator will attempt to put 52 integers into registers. 0N/A // The settings below limit this problem to methods with 0N/A // many long values which are being run on 32-bit SPARC. 0N/A // ------------------- reg_pressure -------------------- 0N/A // Each entry is reg_pressure_per_value,number_of_regs 0N/A // RegL RegI RegFlags RegF RegD INTPRESSURE FLOATPRESSURE 0N/A // IA32 2 1 1 1 1 6 6 0N/A // IA64 1 1 1 1 1 50 41 0N/A // SPARC 2 2 2 2 2 48 (24) 52 (26) 0N/A // SPARCV9 2 2 2 2 2 48 (24) 52 (26) 0N/A // AMD64 1 1 1 1 1 14 15 0N/A // ----------------------------------------------------- 0N/A // Define platform specific register pressure 0N/A // If this def of a double forces a mis-aligned double, 0N/A // flag as '_fat_proj' - really flag as allowing misalignment 0N/A // AND changes how we count interferences. A mis-aligned 0N/A // double can interfere with TWO aligned pairs, or effectively 0N/A case 0:
// not an ideal register 0N/A // Now do the same for inputs 0N/A // Setup for CISC SPILLING 0N/A // Convert operand number to edge index number 0N/A // Prepare register mask for each input 0N/A // If this instruction is CISC Spillable, add the flags 0N/A // bit to its appropriate input 0N/A // // Testing for floating point code shape 0N/A // Node *test = n->in(k); 0N/A // if( test->is_Mach() ) { 0N/A // MachNode *m = test->as_Mach(); 0N/A // int op = m->ideal_Opcode(); 0N/A // if (n->is_Call() && (op == Op_AddF || op == Op_MulF) ) { 0N/A // Limit result register mask to acceptable registers. 0N/A // Do not limit registers from uncommon uses before 0N/A // AggressiveCoalesce. This effectively pre-virtual-splits 0N/A // around uncommon uses of common defs. 0N/A // Since we are BEFORE aggressive coalesce, leave the register 0N/A // mask untrimmed by the call. This encourages more coalescing. 0N/A // Later, AFTER aggressive, this live range will have to spill 0N/A // but the spiller handles slow-path calls very nicely. 0N/A // Check for bound register masks 3845N/A "vector must be in vector registers");
0N/A // If this use of a double forces a mis-aligned double, 0N/A // flag as '_fat_proj' - really flag as allowing misalignment 0N/A // AND changes how we count interferences. A mis-aligned 0N/A // double can interfere with TWO aligned pairs, or effectively 0N/A // if the LRG is an unaligned pair, we will have to spill 0N/A // so clear the LRG's register mask if it is not already spilled 0N/A // Check for maximum frequency value 0N/A }
// End for all allocated inputs 0N/A }
// end for all instructions 0N/A }
// end for all blocks 0N/A // Final per-liverange setup 0N/A//------------------------------set_was_low------------------------------------ 0N/A// Set the was-lo-degree bit. Conservative coalescing should not change the 0N/A// colorability of the graph. If any live range was of low-degree before 0N/A// coalescing, it should Simplify. This call sets the was-lo-degree bit. 0N/A// The bit is checked in Simplify. 0N/A }
else {
// Else check the Brigg's assertion 0N/A // Brigg's observation is that the lo-degree neighbors of a 0N/A // hi-degree live range will not interfere with the color choices 0N/A // of said hi-degree live range. The Simplify reverse-stack-coloring 0N/A // order takes care of the details. Hence you do not have to count 0N/A // low-degree neighbors when determining if this guy colors. 0N/A//------------------------------cache_lrg_info--------------------------------- 0N/A// Compute cost/area ratio, in case we spill. Build the lo-degree list. 0N/A // Check for being of low degree: means we can be trivially colored. 0N/A // Low degree, dead or must-spill guys just get to simplify right away 0N/A // Split low degree list into those guys that must get a 0N/A // register and those that can go to register or stack. 0N/A // The idea is LRGs that can go register or stack color first when 0N/A // they have a good chance of getting a register. The register-only 0N/A // lo-degree live ranges always get a register. 0N/A }
else {
// Else high degree 0N/A//------------------------------Pre-Simplify----------------------------------- 0N/A// Simplify the IFG by removing LRGs of low degree that have NO copies 0N/A // Warm up the lo-degree no-copy list 0N/A // Put the simplified guy on the simplified list. 0N/A // Yank this guy from the IFG. 0N/A // If any neighbors' degrees fall below their number of 0N/A // allowed registers, then put that neighbor on the low degree 0N/A // list. Note that 'degree' can only fall and 'numregs' is 0N/A // unchanged by this action. Thus the two are equal at most once, 0N/A // so LRGs hit the lo-degree worklists at most once. 0N/A // Check for just becoming of-low-degree 0N/A // Put on lo-degree list 0N/A }
// End of while lo-degree no_copy worklist not empty 0N/A // No more lo-degree no-copy live ranges to simplify 0N/A//------------------------------Simplify--------------------------------------- 0N/A// Simplify the IFG by removing LRGs of low degree. 0N/A while(
1 ) {
// Repeat till simplified it all 0N/A // May want to explore simplifying lo_degree before _lo_stk_degree. 0N/A // This might result in more spills coloring into registers during 0N/A // If possible, pull from lo_stk first 0N/A // Put the simplified guy on the simplified list. 0N/A // If this guy is "at risk" then mark his current neighbors 0N/A // Yank this guy from the IFG. 0N/A // If any neighbors' degrees fall below their number of 0N/A // allowed registers, then put that neighbor on the low degree 0N/A // list. Note that 'degree' can only fall and 'numregs' is 0N/A // unchanged by this action. Thus the two are equal at most once, 0N/A // so LRGs hit the lo-degree worklist at most once. 0N/A // Check for just becoming of-low-degree just counting registers. 0N/A // _must_spill live ranges are already on the low degree list. 0N/A // Pull from hi-degree list 0N/A // Check for got everything: is hi-degree list empty? 0N/A // Time to pick a potential spill guy 0N/A // Find cheapest guy 0N/A // It's just vaguely possible to move hi-degree to lo-degree without 0N/A // going through a just-lo-degree stage: If you remove a double from 0N/A // a float live range it's degree will drop by 2 and you can skip the 0N/A // just-lo-degree stage. It's very rare (shows up after 5000+ methods 0N/A // in -Xcomp of Java2Demo). So just choose this guy to simplify next. 0N/A // wins. Ties happen because all live ranges in question have spilled 0N/A // a few times before and the spill-score adds a huge number which 0N/A // washes out the low order bits. We are choosing the lesser of 2 0N/A // evils; in this case pick largest area to spill. 1008N/A // Ties also happen when live ranges are defined and used only inside 1008N/A // one block. In which case their area is 0 and score set to max. 1008N/A // In such case choose bound live range over unbound to free registers 1008N/A // or with smaller cost to spill. 0N/A // The live range we choose for spilling is either hi-degree, or very 0N/A // rarely it can be low-degree. If we choose a hi-degree live range 0N/A // there better not be any lo-degree choices. 0N/A // Pull from hi-degree list 0N/A // Jam him on the lo-degree list, despite his high degree. 0N/A // Maybe he'll get a color, and maybe he'll spill. 0N/A // Only Select() will know. 0N/A }
// End of while not simplified everything 3970N/A//------------------------------is_legal_reg----------------------------------- 3970N/A// Is 'reg' register legal for 'lrg'? 3970N/A // RA uses OptoReg which represent the highest element of a registers set. 3970N/A // For example, vectorX (128bit) on x86 uses [XMM,XMMb,XMMc,XMMd] set 3970N/A // in which XMMd is used by RA to represent such vectors. A double value 3970N/A // uses [XMM,XMMb] pairs and XMMb is used by RA for it. 3970N/A // The register mask uses largest bits set of overlapping register sets. 3970N/A // On x86 with AVX it uses 8 bits for each XMM registers set. 3970N/A // The 'lrg' already has cleared-to-set register mask (done in Select() 3970N/A // before calling choose_color()). Passing mask.Member(reg) check above 3970N/A // indicates that the size (num_regs) of 'reg' set is less or equal to 3970N/A // For set size 1 any register which is member of 'lrg' mask is legal. 3970N/A // For larger sets only an aligned register with the same set size is legal. 0N/A//------------------------------bias_color------------------------------------- 0N/A// Choose a color using the biasing heuristic 0N/A // Check for "at_risk" LRG's 0N/A // Walk the colored neighbors of the "at_risk" candidate 0N/A // Choose a color which is both legal and already taken by a neighbor 0N/A // of the "at_risk" candidate in order to improve the chances of the 0N/A // "at_risk" candidate of coloring 0N/A // If this LRG's register is legal for us, choose it 0N/A // If he has a color, 0N/A // And it is legal for you, 0N/A // Choose a color which is legal for him 0N/A // If no bias info exists, just go with the register selection ordering 0N/A // CNC - Fun hack. Alternate 1st and 2nd selection. Enables post-allocate 0N/A // copy removal to remove many more copies, by preventing a just-assigned 0N/A // register from being repeatedly assigned. 0N/A // This 'Remove; find; Insert' idiom is an expensive way to find the 0N/A // SECOND element in the mask. 0N/A//------------------------------choose_color----------------------------------- 0N/A// Choose a color in the current chunk 0N/A // Use a heuristic to "bias" the color choice 0N/A // Fat-proj case or misaligned double argument. 0N/A // Return the highest element in the set. 0N/A//------------------------------Select----------------------------------------- 0N/A// Select colors by re-inserting LRGs back into the IFG. LRGs are re-inserted 0N/A// in reverse order of removal. As long as nothing of hi-degree was yanked, 0N/A// everything going back is guaranteed a color. Select that color. If some 0N/A// hi-degree LRG cannot get a color then we record that we must spill. 0N/A // Pull next LRG from the simplified list - in reverse order of removal 0N/A // Re-insert into the IFG 0N/A // capture allstackedness flag before mask is hacked 0N/A // Yeah, yeah, yeah, I know, I know. I can refactor this 0N/A // to avoid the GOTO, although the refactored code will not 0N/A // be much clearer. We arrive here IFF we have a stack-based 0N/A // live range that cannot color in the current chunk, and it 0N/A // has to move into the next free stack chunk. 0N/A int chunk = 0;
// Current chunk is first chunk 0N/A // Remove neighbor colors 0N/A // Note that neighbor might be a spill_reg. In this case, exclusion 0N/A // of its color will be a no-op, since the spill_reg chunk is in outer 0N/A // space. Also, if neighbor is in a different chunk, this exclusion 0N/A // will be a no-op. (Later on, if lrg runs out of possible colors in 0N/A // its chunk, a new chunk of color may be tried, in which case 0N/A // examination of neighbors is started again, at retry_next_chunk.) 0N/A // Only subtract masks in the same chunk 0N/A //assert(is_allstack == lrg->mask().is_AllStack(), "nbrs must not change AllStackedness"); 0N/A // Aligned pairs need aligned masks 0N/A // Check if a color is available and if so pick the color 0N/A // If we fail to color and the AllStack flag is set, trigger 0N/A // a chunk-rollover event 0N/A // Bump register mask up to next stack chunk 0N/A // Did we get a color? 0N/A // Record selected register 0N/A // Fold reg back into normal space 0N/A // If the live range is not bound, then we actually had some choices 0N/A // to make. In this case, the mask has more bits in it than the colors 605N/A // chosen. Restrict the mask to just what was picked. 3845N/A // For vectors and pairs, also insert the low bit of the pair 0N/A }
else {
// Else fatproj 0N/A // mask must be equal to fatproj bits, by definition 0N/A // Note that reg is the highest-numbered register in the newly-bound mask. 0N/A }
// end color available case 0N/A // Live range is live and no colors available 0N/A // Assign the special spillreg register 0N/A // Do not empty the regmask; leave mask_size lying around 0N/A // for use during Spilling 0N/A//------------------------------copy_was_spilled------------------------------- 0N/A// Copy 'was_spilled'-edness from the source Node to the dst Node. 0N/A//------------------------------set_was_spilled-------------------------------- 0N/A// Set the 'spilled_once' or 'spilled_twice' flag on a node. 0N/A//------------------------------fixup_spills----------------------------------- 0N/A// Convert Ideal spill instructions into proper FramePtr + offset Loads and 0N/A// Stores. Use-def chains are NOT preserved, but Node->LRG->reg maps are. 0N/A // This function does only cisc spill work. 0N/A // Grab the Frame Pointer 0N/A // For all instructions in block 0N/A // Dead instruction??? 0N/A C->
top() == n ||
// Or the random TOP node 0N/A "No dead instructions after post-alloc" );
0N/A // Convert operand number to edge index number 0N/A // Doubles record the HIGH register of an adjacent pair. 0N/A // This is a CISC Spill, get stack offset and construct new node 0N/A // Bailout if we might exceed node limit when spilling this instruction 0N/A }
// End of for all instructions 0N/A }
// End of for all blocks 0N/A//------------------------------find_base_for_derived-------------------------- 0N/A// Helper to stretch above; recursively discover the base Node for a 0N/A// given derived Node. Easy for AddP-related machine nodes, but needs 0N/A// to be recursive for derived Phis. 0N/A // See if already computed; if so return it 0N/A // See if this happens to be a base. 0N/A // NOTE: we use TypePtr instead of TypeOopPtr because we can have 0N/A // pointers derived from NULL! These are always along paths that 0N/A // can't happen at run-time but the optimizer cannot deduce it so 0N/A // we have to handle it gracefully. 0N/A // If its an OOP with a non-zero offset, then it is derived. 0N/A // Derived is NULL+offset? Base is NULL! 729N/A // Initialize it once and make it shared: 729N/A // set control to _root and place it into Start block 729N/A // (where top() node is placed). 0N/A // Check for AddP-related opcodes 0N/A // Recursively find bases for Phis. 0N/A // First check to see if we can avoid a base Phi here. 0N/A // Went to the end without finding any different bases? 0N/A // Now we see we need a base-Phi here to merge the bases 0N/A // Search the current block for an existing base-Phi 0N/A for( i =
1; i <= b->
end_idx(); i++ ) {
// Search for matching Phi 0N/A // See if Phi matches. 0N/A base =
phi;
// Then use existing 'phi' and drop 'base' 0N/A // Cache info for later passes 0N/A//------------------------------stretch_base_pointer_live_ranges--------------- 0N/A// At each Safepoint, insert extra debug edges for each pair of derived value/ 0N/A// base pointer that is live across the Safepoint for oopmap building. The 0N/A// edge pairs get added in after sfpt->jvmtail()->oopoff(), but are in the 0N/A// required edge set. 0N/A // For all blocks in RPO do... 0N/A // Note use of deep-copy constructor. I cannot hammer the original 0N/A // liveout bits, because they are needed by the following coalesce pass. 0N/A // Pre-split compares of loop-phis. Loop-phis form a cycle we would 0N/A // like to see in the same register. Compare uses the loop-phi and so 0N/A // extends its live range BUT cannot be part of the cycle. If this 0N/A // extended live range overlaps with the update of the loop-phi value 0N/A // we need both alive at the same time -- which requires at least 1 0N/A // copy. But because Intel has only 2-address registers we end up with 0N/A // at least 2 copies, one before the loop-phi update instruction and 0N/A // one after. Instead we split the input to the compare just after the 0N/A // Get value being defined 0N/A // Remove from live-out set 0N/A // Copies do not define a new value and so do not interfere. 0N/A // Remove the copies source from the liveout set before interfering. 0N/A // Found a safepoint? 0N/A // Now scan for a live derived pointer 0N/A // Find reaching DEF for base and derived values 0N/A // This works because we are still in SSA during this call. 0N/A // If its an OOP with a non-zero offset, then it is derived. 0N/A // Add reaching DEFs of derived pointer and base pointer as a 0N/A // See if the base pointer is already live to this point. 0N/A // Since I'm working on the SSA form, live-ness amounts to 0N/A // reaching def's. So if I find the base's live range then 0N/A // I know the base's def reaches here. 0N/A // Base pointer is not currently live. Since I stretched 0N/A // the base pointer to here and it crosses basic-block 0N/A // boundaries, the global live info is now incorrect. 0N/A }
// End of if base pointer is not live to debug info 0N/A }
// End of scan all live data for derived ptrs crossing GC point 0N/A }
// End of if found a GC point 0N/A // Make all inputs live 0N/A if( !n->
is_Phi() ) {
// Phi function uses come from prior block 0N/A }
// End of forall instructions in block 0N/A }
// End of forall blocks 0N/A // If I created a new live range I need to recompute live 0N/A//------------------------------add_reference---------------------------------- 0N/A// Extend the node to LRG mapping 0N/A//------------------------------dump------------------------------------------- 0N/A for( k = 0; k < n->
req(); k++) {
0N/A // Data MultiNode's can have projections with no real registers. 0N/A // Don't die while dumping them. 0N/A // For all instructions 0N/A // Print live-out info at end of block 0N/A tty->
print(
"--- Chaitin -- argsize: %d framesize: %d ---\n",
0N/A // End of per-block dump 0N/A // Dump lo-degree list 0N/A // Dump lo-stk-degree list 0N/A // Dump lo-degree list 0N/A//------------------------------dump_degree_lists------------------------------ 0N/A // Dump lo-degree list 0N/A // Dump lo-stk-degree list 0N/A // Dump lo-degree list 0N/A//------------------------------dump_simplified-------------------------------- 0N/A//------------------------------dump_register---------------------------------- 0N/A// Dump a register name into a buffer. Be intelligent if we get called 0N/A// before allocation is complete. 0N/A if( !
this ) {
// Not got anything? 0N/A // Post allocation, use direct mappings, no LRG info available 0N/A }
else if( !
lidx ) {
// Special, not allocated value 3845N/A // Hah! We have a bound machine register 0N/A//----------------------dump_for_spill_split_recycle-------------------------- 0N/A // Display which live ranges need to be split and the allocator's state 0N/A//------------------------------dump_frame------------------------------------ 0N/A // Incoming arguments in registers dump 0N/A // Check for un-owned padding above incoming args 0N/A // Incoming argument area dump 0N/A // Old outgoing preserve area 0N/A // Preserve area dump 3239N/A // Preserved slots are present on x86 0N/A // Outgoing argument area dump 0N/A // Outgoing new preserve area 0N/A//------------------------------dump_bb---------------------------------------- 0N/A//------------------------------dump_lrg--------------------------------------- 0N/A tty->
print(
"Attempt to print live range index beyond max live range.\n");
0N/A // For all instructions 1923N/A if (!m)
continue;
// be robust in the dumper 0N/A }
// End of per-block dump 0N/A#
endif // not PRODUCT 0N/A//------------------------------print_chaitin_statistics------------------------------- 0N/A tty->
print(
"Conservatively coalesced %d copies, %d pairs",
0N/A tty->
print_cr(
"Lost coalesce opportunity, %d private-private, and %d cflow interfered.",
0N/A#
endif // not PRODUCT