matcher.cpp revision 2902
/*
 * Copyright (c) 1997, 2011, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

//---------------------------Matcher-------------------------------------------

//------------------------------warp_incoming_stk_arg------------------------
// This warps a VMReg into an OptoReg::Name
// the compiler cannot represent this method's calling sequence

//---------------------------compute_old_SP------------------------------------

// Make sure that the new graph only references new nodes

//---------------------------match---------------------------------------------
assert(false, "invalid MaxLabelRootDepth, increase it to 100 minimum");
// One-time initialization of some register masks.
// Pointers take 2 slots in 64-bit land

// Map a Java-signature return type into return register-value
// machine registers for 0, 1 and 2 returned values.
// Get ideal-register return type
// Get machine return register

// Need the method signature to determine the incoming argument types,
// because the types determine which registers the incoming arguments are
// in, and this affects the matched code.

// Pass array of ideal registers and length to USER code (from the AD file)
// that will convert this to an array of register numbers.

// Sanity check users' calling convention.  Real handy while trying to
// get the initial port correct.
       "parameters in register must be preserved by runtime stubs");
for (uint j = 0; j < i; j++) {
         "calling conv. must produce distinct regs");
// Do some initial frame layout.

// Compute the old incoming SP (may be called FP) as
//   OptoReg::stack0() + locks + in_preserve_stack_slots + pad2.

// Compute highest incoming stack argument as
//   _old_SP + out_preserve_stack_slots + incoming argument size.
for( i = 0; i < argcnt; i++ ) {
  // Permit args to have no register
  // calling_convention returns stack arguments as a count of
  // slots beyond OptoReg::stack0()/VMRegImpl::stack0.  We need to convert this to
  // the allocator's point of view, taking into account all the
  // preserve area, locks & pad2.
  // Saved biased stack-slot register number

// Finally, make sure the incoming arguments take up an even number of
// words, in case the arguments or locals need to contain doubleword stack
// slots.  The rest of the system assumes that stack slot pairs (in
// particular, in the spill area) which look aligned will in fact be
// aligned relative to the stack pointer in the target machine.  Double
// stack slots will always be allocated aligned.

// Compute highest outgoing stack argument as
//   _new_SP + out_preserve_stack_slots + max(outgoing argument size).

// the compiler cannot represent this method's calling sequence
if (C->failing())  return;  // bailed out on incoming arg failure

// Collect roots of matcher trees.  Every node for which
// _shared[_idx] is cleared is guaranteed to not be shared, and thus
// can be a valid interior of some tree.

// Create new ideal node ConP #NULL even if it does exist in old space
// to avoid false sharing if the corresponding mach node is not used.
// The corresponding mach node is only used in rare cases for derived
// pointers.

// Swap out to old-space; emptying new-space
// Save debug and profile information for nodes in old space:

// Pre-size the new_node table to avoid the need for range checks.

// Reset node counter so MachNodes start with _idx at 0

// Recursively match trees from old space into new space.
// Correct leaves of new-space Nodes; they point to old-space.

// During matching shared constants were attached to C->root()
// because xroot wasn't available yet, so transfer the uses to
// the xroot.

// Generate new mach node for ConP #NULL
// Don't set control, it will confuse GCM since there are no uses.
// The control will be set when this node is used first time
// in find_base_for_derived().

// ------------------------
// Set up save-on-entry registers

//------------------------------Fixup_Save_On_Entry----------------------------
// The stated purpose of this routine is to take care of save-on-entry
// registers.  However, the overall goal of the Match phase is to convert into
// machine-specific instructions which have RegMasks to guide allocation.
// So what this procedure really does is put a valid RegMask on each input
// to the machine-specific variations of all Return, TailCall and Halt
// instructions.  It also adds edges to define the save-on-entry values (and of
// course gives them a mask).

// Do all the pre-defined register masks

//---------------------------init_first_stack_mask-----------------------------
// Create the initial stack mask used by values spilling to the stack.
// Disallow any debug info in outgoing argument areas by setting the
// initial mask accordingly.

// Allocate storage for spill masks as masks for the appropriate load type.

// At first, start with the empty mask
// Add in the incoming argument area
// Add in all bits past the outgoing argument area
       "must be able to represent all call arguments in reg mask");

// Finally, set the "infinite stack" bit.

// Make spill masks.  Registers for their class, plus FIRST_STACK_mask.
// This mask logic assumes that the spill operations are
// symmetric and that the registers involved are the same size.
// On sparc for instance we may have to use 64 bit moves, which will
// kill 2 registers when used with F0-F31.

// ARM has support for moving 64bit values between a pair of
// integer registers and a double register

// Make up debug masks.  Any spill slot plus callee-save registers.
// Caller-save registers are assumed to be trashable by the various
// inline-cache fixup routines.

// Prevent stub compilations from attempting to reference
// callee-saved registers from debug info
// registers the caller has to save do not work

// Subtract the register we use to save the SP for MethodHandle
// invokes from the debug mask.

//---------------------------is_save_on_entry----------------------------------

// Also save argument registers in the trampolining stubs

//---------------------------Fixup_Save_On_Entry-------------------------------
// Count number of save-on-entry registers.

// Find the procedure Start Node
// Save argument registers in the trampolining stubs

// Input RegMask array shared by all Returns.
// The type for doubles and longs has a count of 2, but
// there is only 1 returned value
// Returns have 0 or 1 returned values depending on call signature.
// Return register is specified by return_value in the AD file.

// Input RegMask array shared by all Rethrows.
// Rethrow takes exception oop only, but in the argument 0 slot.
// Need two slots for ptrs in 64-bit land

// Input RegMask array shared by all TailCalls
// Input RegMask array shared by all TailJumps

// TailCalls have 2 returned values (target & moop), whose masks come
// from the usual MachNode/MachOper mechanism.  Find a sample
// TailCall to extract these masks and put the correct masks into
// the tail_call_rms array.
// TailJumps have 2 returned values (target & ex_oop), whose masks come
// from the usual MachNode/MachOper mechanism.  Find a sample
// TailJump to extract these masks and put the correct masks into
// the tail_jump_rms array.

// Input RegMask array shared by all Halts

// Capture the return input masks into each exit flavor

// Next unused projection number from Start.

// Do all the save-on-entry registers.  Make projections from Start for
// them, and give them a use at the exit points.  To the allocator, they
// look like incoming register arguments.

// Add the save-on-entry to the mask array
// Halts need the SOE registers, but only in the stack as debug info.
// A just-prior uncommon-trap or deoptimization will use the SOE regs.

// Is this a RegF low half of a RegD?  Double up 2 adjacent RegF's
// Add other bit for double
else if( (i&1) == 1 &&        // Else check for high half of double

// Is this a RegI low half of a RegL?  Double up 2 adjacent RegI's
// Add other bit for long
else if( (i&1) == 1 &&        // Else check for high half of long

// Make a projection for it off the Start
// Add a use of the SOE register to all exit paths
}   // End of if a save-on-entry register
}   // End of for all machine registers

//------------------------------init_spill_mask--------------------------------
// pointers are twice as big
// Start at OptoReg::stack0()
// STACK_ONLY_mask is all stack bits
// Also set the "infinite stack" bit.

// Copy the register names over into the shared world
// SharedInfo::regName[i] = regName[i];

// Handy RegMasks per machine register

// Grab the Frame Pointer
// Share frame pointer while making spill ops

// Compute generic short-offset Loads

// Get the ADLC notion of the right regmask, for each basic type.
if (!VerifyAliases)  return;  // do not go looking for trouble by default

// Detune the assert for cases like (AndI 0xFF (LoadB p)).
for (uint i = 1; i < n->req(); i++) {
  // %%% Kludgery.  Instead, fix ideal adr_type methods for all these cases:
         "must not lose alias info when matching");
//------------------------------MStack-----------------------------------------
// State and MStack class used in xform() and find_shared() iterative methods.

//------------------------------xform------------------------------------------
// Given a Node in old-space, Match him (Label/Reduce) to produce a machine
// Node in new-space.  Given a new-space Node, recursively walk his children.

// Old-space or new-space check

// Calls match special.  They match alone with no children.
// Their children, the incoming arguments, match normally.
} else {                // Nothing the matcher cares about

// Convert to machine-dependent projection
if (m->in(0) != NULL)   // m might be top

} else {                // Else just a regular 'ol guy
  m = n->clone();       // So just clone into new-space
  // Def-Use edges will be added incrementally as Uses
  // of this node are matched.

n = m;    // n is now a new-space node

// Put precedence edges on stack first (match them last).
// set -1 to call add_prec() instead of set_req() during Step1

// For constant debug info, I'd rather have unmatched constants.

// Now do only debug info.  Clone constants rather than matching.
// Constants are represented directly in the debug info without
// the need for executable machine instructions.
// Monitor boxes are also represented directly.
for (i = cnt - 1; i >= debug_cnt; --i) {
  // For all debug inputs do
  Node *m = n->in(i);   // Get input
  // || op == Op_BoxLock  // %%%% enable this and remove (+++) in chaitin.cpp

// And now walk his children, and convert his inputs to new-space.
for( ; i >= 0; --i ) {  // For all normal inputs do
  Node *m = n->in(i);   // Get input

if (p != NULL) {        // root doesn't have parent
  p->set_req(i, n);     // required input

mstack.pop();           // remove processed node from stack
}   // while (mstack.is_nonempty())
return n;
// Return new-space Node

//------------------------------warp_outgoing_stk_arg------------------------
// Convert outgoing argument location to a pre-biased stack offset

// Adjust the stack slot offset to be the register number used
// Keep track of the largest numbered stack slot used for an arg.
// Largest used slot per call-site indicates the amount of stack
// that is killed by the call.

//------------------------------match_sfpt-------------------------------------
// Helper function to match call instructions.  Calls match special.
// They match alone with no children.  Their children, the incoming
// arguments, match normally.

// Split out case for SafePoint vs Call
// Match just the call, nothing else

// Copy data from the Ideal SafePoint to the machine version
// This is a non-call safepoint

// Advertise the correct memory effects (for anti-dependence computation).

// Allocate a private array of RegMasks.  These RegMasks are not shared.
// Do all the pre-defined non-Empty register masks

// Place where the first outgoing argument can possibly be put.
// Compute max outgoing register number per call site.
// Calls to C may hammer extra stack slots above and beyond any arguments.
// These are usually backing store for register arguments for varargs.

// Do the normal argument list (parameters) register masks
if( argcnt > 0 ) {          // Skip it all if we have no args
  for( i = 0; i < argcnt; i++ ) {
    // V-call to pick proper calling convention

// Sanity check users' calling convention.  Really handy during
// the initial porting effort.  Fairly expensive otherwise.
{
for (int i = 0; i < argcnt; i++) {
  for (int j = 0; j < i; j++) {

// Visit each argument.  Compute its outgoing register mask.
// Return results now can have 2 bits returned.
// Compute max over all outgoing arguments both per call-site
// and over the entire method.
for( i = 0; i < argcnt; i++ ) {
    // Address of incoming argument mask to fill in
    continue;             // Avoid Halves
    // Grab first register, adjust stack slots and insert in mask.
    // Grab second register (if any), adjust stack slots and insert in mask.
  }   // End of for all arguments

// Compute number of stack slots needed to restore stack in case of
// Pascal-style argument popping.

// Kill some extra stack space in case method handles want to do
// a little in-place argument insertion.
// Do not update mcall->_argsize because (a) the extra space is not
// pushed as arguments and (b) _argsize is dead (not used anywhere).

// Compute the max stack slot killed by any call.  These will not be
// available for debug info, and will be used to adjust FIRST_STACK_mask
// after all call sites have been visited.

// Kill the outgoing argument area, including any non-argument holes and
// any legacy C-killed slots.  Use Fat-Projections to do the killing.
// Since the max-per-method covers the max-per-call-site and debug info
// is excluded on the max-per-method basis, debug info cannot land in
// the outgoing area.

// Transfer the safepoint information from the call to the mcall
// Move the JVMState list

// Debug inputs begin just after the last incoming parameter

// Registers killed by the call are set in the local scheduling pass
// of Global Code Motion.

//---------------------------match_tree----------------------------------------
// Match an Ideal Node DAG - turn it into a tree; Label & Reduce.  Used as part
// of the whole-sale conversion from Ideal to Mach Nodes.  Also used for
// making GotoNodes while building the CFG and in init_spill_mask() to identify
// a Load's result RegMask for memoization in idealreg2regmask[]

// Set the mark for all locally allocated State objects.
// When this call returns, the _states_arena arena will be reset
// freeing all State objects.

// StoreNodes require their Memory input to match any LoadNodes

// State object for root node of match tree
// Allocate it on _states_arena - stack allocation can cause stack overflow.

// Label the input tree, allocating labels from top-level arena

// The minimum cost match for the whole tree is found at the root State
if( s->valid(i) &&    // valid entry and

// Reduce input tree based upon the state labels to machine Nodes

// Add any Matcher-ignored edges

//---------------------------match_into_reg------------------------------------
// Choose to either match this Node in a register or part of the current
// match tree.  Return true for requiring a register and false for matching
// as part of the current match tree.

// Never force constants into registers.  Allow them to match as
// constants or registers.  Copies of the same value will share
// the same register.  See find_shared_node.
}
else {                        // Not a constant

// Stop recursion if they have different Controls.
// Slot 0 of constants is not really a Control.

// Actually, we can live with the most conservative control we
// find, if it post-dominates the others.  This allows us to
// pick up load/op/store trees where the load can float a little
if( x->is_Region() )        // Bail out at merge points
if( x == m->in(0) )         // Does 'control' post-dominate
  break;                    // m->in(0)?  If so, we can use it
if( j == max_scan )         // No post-domination before scan end?
  return true;              // Then break the match tree up

// These are commonly used in address expressions and can
// efficiently fold into them on X64 in some cases.

// Not forceable cloning.  If shared, put it into a register.

//------------------------------Instruction Selection--------------------------
// Label method walks a "tree" of nodes, using the ADLC generated DFA to match
// ideal nodes to machine instructions.  Trees are delimited by shared Nodes,
// things the Matcher does not match (e.g., Memory), and things with different
// Controls (hence forced into different blocks).  We pass in the Control
// selected for this entire State tree.

// The Matcher works on Trees, but an Intel add-to-memory requires a DAG: the
// Store and the Load must have identical Memories (as well as identical
// pointers).  Since the Matcher does not have anything for Memory (and
// does not handle DAGs), I have to match the Memory input myself.  If the
// Tree root is a Store, I require all Loads to have the identical memory.

// Since Label_Root is a recursive function, it's possible that we might run
// out of stack space.  See bugs 6272980 & 6227033 for more info.
uint care = 0;
// Edges matcher cares about

// Examine children for memory state
// Can only subsume a child into your match-tree if that child's memory state
// is not modified along the path to another input.
// It is unsafe even if the other inputs are separate roots.
for( i = 1; i < cnt; i++ ) {
  Node *m = n->in(i);   // Get ith input
  assert( m, "expect non-null children" );
for( i = 1; i < cnt; i++ ){   // For my children
  Node *m = n->in(i);         // Get ith input
  // Allocate states out of a private arena
  // Recursively label the State tree.

// Check for leaves of the State Tree; things that cannot be a part of
// the current tree.  If it finds any, that value is matched as a
// register operand.  If not, then the normal matching is used.

// Stop recursion if this is a LoadNode and the root of this tree is a
// StoreNode and the load & store have different memories.
// Can NOT include the match of a subtree when its memory state
// is used by any of the other subtrees

// Print when we exclude matching due to different memory states at input-loads

// Switch to a register-only opcode; this value must be in a register
// and cannot be subsumed as part of a larger instruction.

// If match tree has no control and we do, adopt it for entire tree
// Else match as a normal part of the match tree.

// Call DFA to match this node, and return
assert( false, "bad AD file" );
// Con nodes reduced using the same rule can share their MachNode
// which reduces the number of copies of a constant in the final
// program.  The register allocator is free to split uses later to
// split live ranges.

// See if this Con has already been reduced using this rule.

// Don't expect control change for DecodeN

// Get the new space root.
// This shouldn't happen given the order of matching.

// Shared constants need to have their control be root so they
// can be scheduled properly.
assert( false, "unexpected control");
//------------------------------ReduceInst-------------------------------------
// Reduce a State tree (with given Control) into a tree of MachNodes.
// This routine (and its cohort ReduceOper) convert Ideal Nodes into
// complicated machine Nodes.  Each MachNode covers some tree of Ideal Nodes.
// Each MachNode has a number of complicated MachOper operands; each
// MachOper also covers a further tree of Ideal Nodes.

// The root of the Ideal match tree is always an instruction, so we enter
// the recursion here.  After building the MachNode, we need to recurse
// the tree checking for these cases:
// (1) Child is an instruction -
//       Build the instruction (recursively), add it as an edge.
//       Build a simple operand (register) to hold the result of the instruction.
// (2) Child is an interior part of an instruction -
//       Skip over it (do nothing)
// (3) Child is the start of an operand -
//       Build the operand, place it inside the instruction

// Build the object to represent this state & prepare for recursive calls

// Check for instruction or instruction chain rule
       "duplicating node that's already been matched");

// Reduce interior of complex instruction
// Instruction chain rules are data-dependent on their inputs

// If a Memory was used, insert a Memory edge
// Verify adr type after matching memory operation
// It has a unique memory operand.  Find corresponding ideal mem node.
// DecodeN node consumed by an address may have different type
// than its input.  Don't compare types for such case.

// If the _leaf is an AddP, insert the base edge

// Perform any 1-to-many expansions required
// Remove old node from the graph

// PhaseChaitin::fixup_spills will sometimes generate spill code
// via the matcher.  By this time, nodes have been wired into the CFG,
// and any further nodes generated by expand rules will be left hanging
// in space, and will not get emitted as output code.  Catch this.
// Also, catch any new register allocation constraints ("projections")
// generated belatedly during spill code generation.

// Record the con for sharing

// 'op' is what I am expecting to receive
// Operand type to catch child's result
// This is what my child will give me.
// Choose between operand class or not.
// This is what I will receive.
// New rule for child.  Chase operand classes to get the actual rule.
// Chain from operand or operand class, may be output of shared node
       "Bad AD file: Instruction chain rule must chain from operand");

// Insert operand into array of operands for this instruction
// Chain from the result of an instruction

// Now recursively walk the state tree & add operand list.
for( uint i=0; i<2; i++ ) {
  // binary tree
  // 'op' is what I am expecting to receive
  // Operand type to catch child's result
  // This is what my child will give me.
  // Choose between operand class or not.
  // This is what I will receive.
  // New rule for child.  Chase operand classes to get the actual rule.
  // Insert operand into array of operands for this instruction
} else {                      // Child is internal operand or new instruction
  // internal operand --> call ReduceInst_Interior
  //   Interior of complex instruction.  Do nothing but recurse.
  // instruction --> call build operand( ) to catch result
  //             --> ReduceInst( newrule )

// This routine walks the interior of possible complex operands.
// At each point we check our children in the match tree:
// (1) Child is a leaf -
//       We are a leaf; add _leaf field as an input to the MachNode
// (2) Child is an internal operand -
//       Skip over it ( do nothing )
// (3) Child is an instruction -
//       Call ReduceInst recursively and
//       add the instruction as an input to the MachNode

// Leaf?  And not subsumed?
assert( mem == (Node*)1, "multiple Memories being matched at once?" );

// Internal operand; recurse but do nothing else
} else {                      // Child is a new instruction
  // Reduce the instruction, and add a direct pointer from this
  // machine instruction to the newly reduced one.

// -------------------------------------------------------------------------
// Java-Java calling convention
// (what you use when Java calls Java)

//------------------------------find_receiver----------------------------------
// For a given signature, return the OptoReg for parameter 0.

// Return argument 0 register.  In the LP64 build pointers
// take 2 registers, but the VM wants only the 'main' name.

// A method-klass-holder may be passed in the inline_cache_reg
// and then expanded into the inline_cache_reg and a method_oop register
//   defined in ad_<arch>.cpp

//------------------------------find_shared------------------------------------
// Set bits if Node is shared or otherwise a root

// Allocate stack of size C->unique() * 2 to avoid frequent realloc

// Mark nodes as address_visited if they are inputs to an address expression

// Flag as visited and shared now.

// Node is shared and has no reason to clone.  Flag it as shared.
// This causes it to match into a register for the sharing.
switch( nop ) {
// Handle some opcodes special
case Op_Phi:        // Treat Phis as shared roots
case Op_Proj:       // All handled specially during matching

// Convert (If (Bool (CmpX A B))) into (If (Bool) (CmpX A B)).  Helps
// with matching cmp/branch in 1 instruction.  The Matcher needs the
// Bool and CmpX side-by-side, because it can only get at constants
// that are at the leaves of Match trees, and the Bool's condition acts
continue;           // while (mstack.is_nonempty())

case Op_ConvI2D:    // These forms efficiently match with a prior
case Op_ConvI2F:    //   Load but not a following Store
n->outcnt() == 1 &&      // Not already shared
n->outcnt() == 1 )       // Not already shared

case Op_BoxLock:    // Can't match until we get stack-regs in ADLC
continue;           // while (mstack.is_nonempty())
set_shared(n);      // Force result into register (it will be anyways)

case Op_ConP: {     // Convert pointers above the centerline to NUL
case Op_ConN: {     // Convert narrow pointers above the centerline to NUL
case Op_Binary:
  // These are introduced in the Post_Visit state.

// Do match stores, despite no ideal reg
if( n->is_Mem() ) {   // Loads and LoadStores

// Loads must be root of match tree due to prior load conflict
// Fall into default case
for( int i = n->req() - 1; i >= 0; --i) {   // For my children
  Node *m = n->in(i);                       // Get ith input
  if (m == NULL)  continue;                 // Ignore NULLs

// Must clone all producers of flags, or we will not match correctly.
// Suppose a compare setting int-flags is shared (e.g., a switch-tree)
// then it will match into an ideal Op_RegFlags.  Alas, the fp-flags
// are also there, so we may match a float-branch to int-flags and
// expect the allocator to haul the flags from the int-side to the
// fp-side.  No can do.
continue;
  // for(int i = ...)

// Bases used in addresses must be shared but since
// they are shared through a DecodeN they may appear
// to have a single use so force sharing here.

// Clone addressing expressions as they are "free" in memory access instructions

// Some inputs for address expression are not put on stack
// to avoid marking them as shared and forcing them into register
// if they are used only in address expressions.
// But they should be marked as shared if there are other uses
// besides address expressions.

// When there are other uses besides address expressions
// put it on stack and mark as shared.

// Intel, ARM and friends can handle 2 adds in addressing mode

// AtomicAdd is not an addressing expression.
// Cheap to find it by looking for screwy base.
// Are there other uses besides address expressions?

// Check for shift by small constant as well
// Are there other uses besides address expressions?

// Allow Matcher to match the rule which bypasses
// ConvI2L operation for an array index on LP64
// if the index value is positive.
// Are there other uses besides address expressions?

} else {    // Sparc, Alpha, PPC and friends
  // Clone X+offset as it also folds into most addressing expressions
  continue;   // for(int i = ...)

// We cannot remove the Cmp input from the Bool here, as the Bool may be
// shared and all users of the Bool need to move the Cmp in parallel.
// This leaves both the Bool and the If pointing at the Cmp.  To
// prevent the Matcher from trying to Match the Cmp along both paths
// BoolNode::match_edge always returns a zero.

// We reorder the Op_If in a pre-order manner, so we can visit without
// accidentally sharing the Cmp (the Bool and the If make 2 users).
n->add_req( n->in(1)->in(1) );   // Add the Cmp next to the Bool

// Now hack a few special opcodes
switch( n->Opcode() ) {       // Handle some opcodes special
case Op_CMoveD:               // Convert trinary to binary-tree

// Restructure into a binary tree for Matching.  It's possible that
// we could move this code up next to the graph reshaping for IfNodes
// or vice-versa, but I do not want to debug this for Ladybird.
}
// end of while (mstack.is_nonempty())

// machine-independent root to machine-dependent root

//---------------------------collect_null_checks-------------------------------
// Find null checks in the ideal graph; write a machine-specific node for
// it.  Used by later implicit-null-check handling.  Actually collects
// either an IfTrue or IfFalse for the common NOT-null path, AND the ideal
// Node being tested.

// During matching If's have Bool & Cmp side-by-side

// Look for DecodeN node which should be pinned to orig_proj.
// On platforms (Sparc) which can not handle 2 adds
// in addressing mode we have to keep a DecodeN node and
// use it to do implicit NULL check in address.

// DecodeN node was pinned to non-null path (orig_proj) during
// CastPP transformation in final_graph_reshaping_impl().

// Mark this as special case to distinguish from
// a regular case: CmpP(DecodeN, NULL).

//---------------------------validate_null_checks------------------------------
// It's possible that the value being NULL checked is not the root of a match
// tree.  If so, I cannot use the value in an implicit null check.

// Note: new_val may have a control edge if
// the original ideal node DecodeN was matched before
// it was unpinned in Matcher::collect_null_checks().
// Unpin the mach node and mark it.

// Is a match-tree root, so replace with the matched value
// Yank from candidate list

// Used by the DFA in dfa_xxx.cpp.  Check for a following barrier or
// atomic instruction acting as a store_load barrier without any
// intervening volatile load, and thus we don't need a barrier here.
// We retain the Node to act as a compiler ordering barrier.

// Get the Proj node, ctrl, that can be used to iterate forward

// We don't need current barrier if we see another or a lock
// before seeing volatile load.

// Op_Fastunlock previously appeared in the Op_* list below.
// With the advent of 1-0 lock operations we're no longer guaranteed
// that a monitor exit operation contains a serializing instruction.
// We must retain this membar if there is an upcoming volatile
// load, which will be preceded by an acquire membar.

// For other kinds of barriers, check by pretending we
// are them, and seeing if we can be removed.

// Delicate code to detect case of an upcoming fastlock block

// The iff might be some random subclass of If or bol might be Con-Top
// probably not necessary to check for these

//=============================================================================
//---------------------------State---------------------------------------------
//memset(_cost, -1, sizeof(_cost));
//memset(_rule, -1, sizeof(_rule));

//---------------------------dump----------------------------------------------
for( int j = 0; j < depth; j++ )
for( int j = 0; j < depth; j++ )