Intermediate Representation
This should become a simplified semi-formal description of the cacao IR.
Method
The IR of a method consists of:
- a list of local variables L0, L1, ..., Lnlocals-1
- a list of interface variables I0, I1, ..., Ininterfaces-1 (only when simplereg is used)
- an implicit method prolog
- a list of basic blocks
- an implicit method epilog
Basic Block
A basic block consists of:
- a block type, which is STD, EXH, or SBR.
- a list of temporary variables T0, T1, ..., Tntemps-1. Each Tj may be unmarked, or marked as STACKVAR(x), LOCALVAR(x), or ARGVAR(x), where x is an integer.
- a list of variable references IN0, IN1, ..., INnin-1
- an implicit block prolog
- a list of insructions (ICMDs)
- an implicit block epilog
- a list of variable references OUT0, OUT1, ..., OUTnout-1
Constraints on the instruction list (basic block properties):
- No control-flow edges may enter the block, except at the beginning.
- No non-exception control-flow edges may leave the block, except at the end.
Temporary variables Tj:
SSA temporaries: For each Tj there may be at most one DEF Tj in the block.
Array allocation in DEF-order: Given two temporary variables Ti and Tj: (i < j) <==> the DEF Ti precedes the DEF Tj in the block.
Nested liferanges: Liferanges of temporaries are properly nested. This means that if the DEF Ti precedes the DEF Tj, then the liferange of Tj ends before the liferange of Ti.
Constraints on variable references:
- The INi may only reference the Tj of this block.
- The OUTi may only reference the Tj of this block.
- No other block may reference any of the Tj of this block.
- If there is a non-exception edge from block A to block B, then nout_A == nin_B (XXX needs clarification for JSR).
STACKVARs:
- A temporary variable Tj that is referenced as INi, OUTi, or both, may be marked as STACKVAR(i) indicating that it has been coalesced with the interface variable Ii.
(See BaselineCopyElimination.)
ARGVARs:
- A temporary variable Tj may be marked ARGVAR(x), which means it has been precolored to the argument register x. A special case of this is ARGVAR(-1), which means that the temporary variable has been precolored to the return value register of the architecture.
LOCALVARs:
- A temporary variable Tj may be marked LOCALVAR(x), which means it has been coalesced
with the local variable Lx. (See BaselineCopyElimination.)
Liveness:
- Local variables Li may be live-in, live-out, or both at the instruction list.
- The variables referenced by the INi may be live-in at the instruction list.
- The variables referenced by the OUTi may be live-out at the instruction list.
- No other variables are allowed to be live-in or live-out at the instruction list.
Coalescing of IN and OUT variables (for SSA)
For SSA, whenever there is a non-exception edge from Block A to Block B:
for all i in {0, 1, ..., nout_A-1 (== nin_B-1)}:
coalesce *OUTi_A with *INi_B
NOTE: It must be guaranteed that OUTi_A and INi_B (for each i), do not interfere with each other.
A problem can arise, for example, if there is a loop in the CFG that leads to two interfering variables to be (transitively) coalesced, because they appear in the IN/OUT references.
- This problem is avoided by a property of the stack model: Each variable has an associated stackdepth, which is well-defined over the whole lifetime of the variable (ie. it never changes). For each variable referenced by INi or OUTi the stackdepth is i. As variables of the same stackdepth can never interfere, it is guaranteed that coalescing IN/OUT variables with the same i is legal.
We must be careful to keep this property when we move away from the stack model.
Block Prolog (STD blocks)
For simplereg:
for i in {0, 1, ..., nin-1}:
*INi <- Ii
- NOTE: It is tried to optimize away these moves by coalescing the *INi with the Ii. This is implemented by marking the *INi as STACKVAR(i) instead of a TEMPVAR.
Block Prolog (EXH blocks)
Block Prolog (SBR blocks)
Block Epilog
For simplereg:
for i in {0, 1, ..., nout-1}:
Ii <- *OUTi
- NOTE: It is tried to optimize away these moves by coalescing the *OUTi with the Ii. This is implemented by marking the *OUTi as STACKVAR(i) instead of a TEMPVAR.
For SSA:
Phi-moves to the successor's in-variables. (Needs to be specified.)
Instruction
An instruction (ICMD) consists of:
- an integer opcode
- zero to 255 source variable references S1, S2, ..., Snsources
- zero to 6 destination variable references D1, D2, ..., Dndests
Constraints:
- A source variable reference Si refers to a variable that is
- live-in at the instruction. (It may also be live-out.)
- A destination variable reference Di refers to a variable that
- is live-out at the instruction. (It may also be live-in.)
NOTE: Only the DUP_* instructions need more than one destination.