UnifiedVariables

This page will describe the new "unified variables" IR of CACAO. The page will explain the data structures used in each step and it will give proofs that the IR and its transformations are sound.

Terms used in this page

javalocal

an untyped local variable as seen by the bytecode. Internally CACAO may split up a javalocal into several typed local variables.

JavaPC
an absolute bytecode index, eg. the target of a GOTO bytecode after making it absolute.


Data available before PARSE

maxstack
maximum stack depth in this method
maxlocals
maximum number of javalocals used in this method
bytecode
array of bytes representing the java instructions
exception handler ranges
as bytecode indices


Data available after PARSE

basic blocks
array of basic blocks
instructions
list of ICMDs for each block, with constants
exception handler ranges
refering to blocks
varcount
the maximum number of variables necessary to represent this method (without JSR inlining)
vars
variables Vi in this method
local_map
map from javalocals (index,type) to a variable indices
localcount
number of variables that represent locals

instructions: The ICMDs contain the opcode and flags of the instruction. They also contain constant values, eg. constants from the constant pool, method references, ... They do not contain explicit variable operands. /!\ branch targets are given as JavaPCs.

vars: The variables are allocated in one continuous array. The variables representing locals are located continuously at the start of the vars array.

basicblock_index: A table mapping JavaPC values to basic block indices.


Internal Data of STACKANALYSE

stack
the stack representation
last_store_boundary
for each javalocal index the stack boundary before the last store

of a variable with this index

new
pointer to the next free stackslot in the array.
coalescing_boundary
the stack boundary back to which coalescing of temporary variables

with local variables may be performed

Stack Representation

The stack representation is a tree, the root of which represents the empty stack. A node of this tree, except the root, is called a stackslot. Stackslots are allocated in a continuous array in DEF-order with respect to the ICMDs list before STACKANALYSE.

The Coalescing Boundary

coalescing_boundary is an index into the array of stackslots. Stackslots with an index < coalescing_boundary are said to be before the coalesing boundary. Stackslots with an index >= coalescing_boundary are said to be after the coalescing boundary.

At the beginning of a basic block B we set coalescing_boundary = B->indepth. Thus all invars of the block B are before the coalescing_boundary. This avoids coalescing of invars with local variables.

While processing the basic block, the coalescing_boundary is changed in the following cases:

The coalescing_boundary is never decreased while analysing a basic block.

Retroactive changes

Usually the variable index of a stackslot and the operands of an instruction are set when the instruction is processed. There are cases, however, that require retroactive changing of stackslots and operands.

Note an important difference between A and B: Case A is an optimization, while case B has to be done to ensure correctness.

Corollary: An operand USEd in a consuming way does never have to be changed retroactively. (Because it is - by definition - not live at any later instruction, and all issues up to the current instruction have been considered already, so it is valid there.)

Corollary: An invar is never retroactively changed. Invars are always before the coalescing boundary, so case A never arises. Invars are not created by LOADs, so case B also never arises.

Regarding destination operands: Since any live non-invar stack slot has exactly one DEF in the block, we can store a reference to the DEFining instruction in the stackslot (stackslot->creator). Thus when we have to retroactively change the stackslot, we know which instruction must have its destination operand changed.

So the only hard problems with retroactive changes are posed by non-consuming USEs and pass-through operands.


Data available after STACKANALYSE

basic blocks
array of basic blocks
instructions
list of ICMDs for each block, with constants and variable operands. Branch targets as basicblock pointers.
exception handler ranges
refering to blocks
varcount
the maximum number of variables necessary to represent this method
vars
variables Vi in this method
local_map
map from javalocals (index,type) to a variable indices

instructions: The ICMDs now explicitely refer to the variables Vi they use.

Variables

A variable may be local variable or a block variable:

{i} VARIANT2 has been chosen.

Constraints on block variables (VARIANT1 - DUP coalescing in STACKANALYSE):

Constraints on block variables (VARIANT2 - DUP coalescing in SIMPLEREG):

Notes:


Internal Data of SIMPLEREG

{i} VARIANT2 has been chosen.

VARIANT1 - DUP coalescing in STACKANALYSE:

VARIANT2 - DUP coalescing in SIMPLEREG:

regcopycount
An array, holding for each hardware registers the number of times the register has been DUPed.
memcopycount
An array, holding for each memory slot the number of times it has been DUPed.

cacaowiki: UnifiedVariables (last edited 2006-10-07 13:08:54 by EdwinSteiner)