Syntax.doc revision 605
#
# Copyright 1997-1998 Sun Microsystems, Inc. All Rights Reserved.
# DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
#
# This code is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License version 2 only, as
# published by the Free Software Foundation.
#
# This code is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
# version 2 for more details (a copy is included in the LICENSE file that
# accompanied this code).
#
# You should have received a copy of the GNU General Public License version
# 2 along with this work; if not, write to the Free Software Foundation,
# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
#
# Please contact Sun Microsystems, Inc., 4150 Network Circle, Santa Clara,
# CA 95054 USA or visit www.sun.com if you need additional information or
# have any questions.
#
#
JavaSoft HotSpot Architecture Description Language Syntax Specification
Version 0.4 - September 19, 1997
A. Introduction
This document specifies the syntax and associated semantics for the JavaSoft
HotSpot Architecture Description Language. This language is used to describe
the architecture of a processor, and is the input to the ADL Compiler. The
ADL Compiler compiles an ADL file into code which is incorporated into the
Optimizing Just In Time Compiler (OJIT) to generate efficient and correct code
for the target architecture. The ADL describes three bassic different types
of architectural features. It describes the instruction set (and associated
operands) of the target architecture. It describes the register set of the
target architecture along with relevant information for the register allocator.
Finally, it describes the architecture's pipeline for scheduling purposes.
The ADL is used to create an architecture description file for a target
architecture. The architecture description file along with some additional
target specific oracles, written in C++, represent the principal effort in
porting the OJIT to a new target architecture.
B. Example Syntax
1. Instruction/Operand Syntax for Matching and Encoding
// Create a cost attribute for all operands, and specify the default value
op_attrib op_cost(10);
// Create a cost attribute for all instruction, and specify a default value
ins_attrib ins_cost(100);
// example operand form
operand x_reg(REG_NUM rnum)
%{
constraint(IS_RCLASS(rnum, RC_X_REG));
match(VREG) %{ rnum = new_VR(); %} // block after rule is constructor
encode %{ return rnum; %} // encode rules are required
// this operand has no op_cost entry because it uses the default value
%}
// example instruction form
instruct add_accum_reg(x_reg dst, reg src)
%{
match(SET dst (PLUS dst src)); // no block = use default constructor
encode // rule is body of a C++ function
%{
return pentium_encode_add_accum_reg(rnum);
%}
ins_cost(200); // this instruction is more costly than the default
%}
2. Register Set Description Syntax for Allocation
reg_def AX(SOE, 1); // declare register AX, mark it save on entry with index 0
reg_def BX(SOC); // declare register BX, and mark it save on call
reg_class X_REG(AX, BX); // form a matcher register class of X_REG
// these are used for constraints, etc.
alloc_class class1(AX, BX); // form an allocation class of registers
// used by the register allocator for separate
// allocation of target register classes
3. Pipeline Syntax for Scheduling
C. Keywords
a. instruct - indicates a machine instruction entry
b. operand - indicates a machine operand entry
c. opclass - indicates a class of machine operands
d. source - indicates a block of C++ source code
e. op_attrib - indicates an optional attribute for all operands
f. ins_attrib - indicates an optional attribute for all instructions
g. match - indicates a matching rule for an operand/instruction
h. encode - indicates an encoding rule for an operand/instruction
i. predicate - indicates a predicate for matching operand/instruction
*j. constraint - indicates a constraint on the value of an operand
*k. effect - describes the dataflow effect of an operand which
is not part of a match rule
*l. expand - indicates a single instruction for matching which
expands to multiple instructions for output
*m. rewrite - indicates a single instruction for matching which
gets rewritten to one or more instructions after
allocation depending upon the registers allocated
*n. format - indicates a format rule for an operand/instruction
o. construct - indicates a default constructor for an operand
[NOTE: * indicates a feature which will not be in first release ]
2. Register
*a. register - indicates an architecture register definition section
*b. reg_def - indicates a register declaration
*b. reg_class - indicates a class (list) of registers for matching
*c. alloc_class - indicates a class (list) of registers for allocation
3. Pipeline
*a. pipeline - indicates an architecture pipeline definition section
*b. resource - indicates a pipeline resource used by instructions
*c. pipe_desc - indicates the relevant stages of the pipeline
*d. pipe_class - indicates a description of the pipeline behavior
for a group of instructions
*e. ins_pipe - indicates what pipeline class an instruction is in
D. Delimiters
1. Comments
a. /* ... */ (like C code)
b. // ... EOL (like C++ code)
c. Comments must be preceeded by whitespace
2. Blocks
a. %{ ... %} (% just distinguishes from C++ syntax)
b. Must be whitespace before and after block delimiters
3. Terminators
a. ; (standard statement terminator)
b. %} (block terminator)
c. EOF (file terminator)
4. Each statement must start on a separate line
5. Identifiers cannot contain: (){}%;,"/\
E. Instruction Form: instruct instr1(oper1 dst, oper2 src) %{ ... %}
1. Identifier (scope of all instruction names is global in ADL file)
2. Operands
a. Specified in argument style: (<type> <name>, ...)
b. Type must be the name of an Operand Form
c. Name is a locally scoped name, used for substitution, etc.
3. Match Rule: match(SET dst (PLUS dst src));
a. Parenthesized Inorder Binary Tree: [lft = left; rgh = right]
(root root->lft (root->rgh (root->rgh->lft root->rgh->rgh)))
b. Interior nodes in tree are operators (nodes) in abstract IR
c. Leaves are operands from instruction operand list
d. Assignment operation and destination, which are implicit
in the abstract IR, must be specified in the match rule.
4. Encode Rule: encode %{ return CONST; %}
a. Block form must contain C++ code which constitutes the
body of a C++ function which takes no arguments, and
returns an integer.
b. Local names (operand names) are can be used as substitution
symbols in the code.
5. Attribute (ins_attrib): ins_cost(37);
a. Identifier (must be name defined as instruction attribute)
b. Argument must be a constant value or a C++ expression which
evaluates to a constant at compile time.
*6. Effect: effect(src, OP_KILL);
a. Arguments must be the name of an operand and one of the
pre-defined effect type symbols:
OP_DEF, OP_USE, OP_KILL, OP_USE_DEF, OP_DEF_USE, OP_USE_KILL
*7. Expand:
a. Parameters for the new instructions must be the name of
an expand rule temporary operand or must match the local
operand name in both the instruction being expanded and
the new instruction being generated.
instruct convI2B( xRegI dst, eRegI src ) %{
match(Set dst (Conv2B src));
expand %{
eFlagsReg cr;
loadZero(dst);
testI_zero(cr,src);
set_nz(dst,cr);
%}
%}
// Move zero into a register without setting flags
instruct loadZero(eRegI dst) %{
effect( DEF dst );
format %{ "MOV $dst,0" %}
opcode(0xB8); // +rd
ins_encode( LdI0(dst) );
%}
*8. Rewrite
9. Format: format(add_X $src, $dst); | format %{ ... %}
a. Argument form takes a text string, possibly containing
some substitution symbols, which will be printed out
to the assembly language file.
b. The block form takes valid C++ code which forms the body
of a function which takes no arguments, and returns a
pointer to a string to print out to the assembly file.
Mentions of a literal register r in a or b must be of
the form r_enc or r_num. The form r_enc refers to the
encoding of the register in an instruction. The form
r_num refers to the number of the register. While
r_num is unique, two different registers may have the
same r_enc, as, for example, an integer and a floating
point register.
F. Operand Form: operand x_reg(REG_T rall) %{ ... %}
1. Identifier (scope of all operand names is global in ADL file)
2. Components
a. Specified in argument style: (<type> <name>, ...)
b. Type must be a predefined Component Type
c. Name is a locally scoped name, used for substitution, etc.
3. Match: (VREG)
a. Parenthesized Inorder Binary Tree: [lft = left; rgh = right]
(root root->lft (root->rgh (root->rgh->lft root->rgh->rgh)))
b. Interior nodes in tree are operators (nodes) in abstract IR
c. Leaves are components from operand component list
d. Block following tree is the body of a C++ function taking
no arguments and returning no value, which assigns values
to the components of the operand at match time.
4. Encode: encode %{ return CONST; %}
a. Block form must contain C++ code which constitutes the
body of a C++ function which takes no arguments, and
returns an integer.
b. Local names (operand names) are can be used as substitution
symbols in the code.
5. Attribute (op_attrib): op_cost(5);
a. Identifier (must be name defined as operand attribute)
b. Argument must be a constant value or a C++ expression which
evaluates to a constant at compile time.
6. Predicate: predicate(0 <= src < 256);
a. Argument must be a valid C++ expression which evaluates
to either TRUE of FALSE at run time.
*7. Constraint: constraint(IS_RCLASS(dst, RC_X_CLASS));
a. Arguments must contain only predefined constraint
functions on values defined in the AD file.
b. Multiple arguments can be chained together logically
with "&&".
8. Construct: construct %{ ... %}
a. Block must be a valid C++ function body which takes no
arguments, and returns no values.
b. Purpose of block is to assign values to the elements
of an operand which is constructed outside the matching
process.
c. This block is logically identical to the constructor
block in a match rule.
9. Format: format(add_X $src, $dst); | format %{ ... %}
a. Argument form takes a text string, possibly containing
some substitution symbols, which will be printed out
to the assembly language file.
b. The block form takes valid C++ code which forms the body
of a function which takes no arguments, and returns a
pointer to a string to print out to the assembly file.
Mentions of a literal register r in a or b must be of
the form r_enc or r_num. The form r_enc refers to the
encoding of the register in an instruction. The form
r_num refers to the number of the register. While
r_num is unique, two different registers may have the
same r_enc, as, for example, an integer and a floating
point register.
G. Operand Class Form: opclass memory( direct, indirect, ind_offset);
H. Attribute Form (keywords ins_atrib & op_attrib): ins_attrib ins_cost(10);
1. Identifier (scope of all attribute names is global in ADL file)
2. Argument must be a valid C++ expression which evaluates to a
constant at compile time, and specifies the default value of
this attribute if attribute definition is not included in an
I. Source Form: source %{ ... %}
1. Source Block
a. All source blocks are delimited by "%{" and "%}".
b. All source blocks are copied verbatim into the
C++ output file, and must be valid C++ code.
Mentions of a literal register r in this code must
be of the form r_enc or r_num. The form r_enc
refers to the encoding of the register in an
instruction. The form r_num refers to the number of
the register. While r_num is unique, two different
registers may have the same r_enc, as, for example,
an integer and a floating point register.
J. *Register Form: register %{ ... %}
1. Block contains architecture specific information for allocation
2. Reg_def: reg_def reg_AX(1);
a. Identifier is name by which register will be referenced
throughout the rest of the AD, and the allocator and
back-end.
b. Argument is the Save on Entry index (where 0 means that
the register is Save on Call). This is used by the
frame management routines for generating register saves
and restores.
3. Reg_class: reg_class x_regs(reg_AX, reg_BX, reg_CX, reg_DX);
a. Identifier is the name of the class used throughout the
instruction selector.
b. Arguments are a list of register names in the class.
4. Alloc_class: alloc_class x_alloc(reg_AX, reg_BX, reg_CX, reg_DX);
a. Identifier is the name of the class used throughout the
register allocator.
b. Arguments are a list of register names in the class.
K. *Pipeline Form: pipeline %{ ... %}
1. Block contains architecture specific information for scheduling
2. Resource: resource(ialu1);
a. Argument is the name of the resource.
3. Pipe_desc: pipe_desc(Address, Access, Read, Execute);
a. Arguments are names of relevant phases of the pipeline.
b. If ALL instructions behave identically in a pipeline
phase, it does not need to be specified. (This is typically
true for pre-fetch, fetch, and decode pipeline stages.)
c. There must be one name per cycle consumed in the
pipeline, even if there is no instruction which has
significant behavior in that stage (for instance, extra
stages inserted for load and store instructions which
are just cycles which pass waiting for the completion
of the memory operation).
4. Pipe_class: pipe_class pipe_normal(dagen; ; membus; ialu1, ialu2);
a. Identifier names the class for use in ins_pipe statements
b. Arguments are a list of stages, separated by ;'s which
contain comma separated lists of resource names.
c. There must be an entry for each stage defined in the
pipe_desc statement, (even if it is empty as in the example)
and entries are associated with stages by the order of
stage declarations in the pipe_desc statement.