|
Prev: Control Flow Graph
Loop Analysis
In the previous page we noticed that we can convert
particular configurations of basic blocks into one
the "if-then-(else)" conditional statement.
All "if" statements have forward jumps, that is
the destination of the jump(s) is some code that
has not been executed, yet.
Compilers use conditional statements to control
another high-level construct: these are statements
that implement loops. In this case, the destination
of the jump is code that has already been executed
at least once.
We could simply leave the conditional statements
as "if-goto" pairs, as in the pictures on the previous
page, since that's what the processor really executes,
(processors don't know about loops except when using
very specific instructions, such as the REP prefix in x86),
but that would not make clear what the code does,
as the reader would have to reconstruct the loop
by checking the destination of the goto.
So we'll try to discover the loops from the configuration
of the basic blocks.
Every loop can be described with a "do-while" statement.
The "while" loop (pre-tested loop) is a
special case of a "do-while" loop, where the bottom
condition is always true, and the first statement of
the loop is an "if" that jumps out of the loop.
The "for" loop is a special case of the "while" loop.
Lhead:
do {
statements;
Lcontinue:
} while(cond);
Lbreak:
|
Lhead:
do {
Lcontinue:
if(cond)
goto Lbreak;
statements;
} while(true);
Lbreak:
|
Lhead:
var = init_expr;
if(cond)
goto Lbreak;
do {
statements;
Lcontinue:
var += incr_expr;
} while(cond);
|
To identify loops we need to know where the
loop is entered, so that we can place the "do"
statement there, and where the loop ends,
so that all statements belonging to the
loop can be placed inside the "do" statement.
The entry point of a loop is in what is called
the "header" block of the loop. All execution
paths must go through the header block to enter
the loop (i.e. we don't consider loops with
multiple entry points, although these are possible).
In order to find loops, we first compute
the set of "dominators" for each block in the
control flow graph.
We're not going to describe the theoretical
foundation of dominators. Plenty of resources
are available on the Internet.
Instead, we're showing the practical implementation
of the algorithm that finds the set of
dominators for each block.
First we use a bit vector where each bit
corresponds to a basic block in the control flow graph.
We associate one such bit vector to each basic block.
Then we use bit-wise and and or operations
to reduce the number of bits that are set in
each vector, until there are no more changes.
Here's the algorithm:
ComputeDominators()
{
int nBlocks = block_list.size();
int i = 0;
for each block in block_list {
block->id = i++;
block->dominators = new BitVector(nBlocks)
block->dominators->SetAllBitsTo(1)
}
block = block_list[0]
block->dominators->SetAllBitsTo(0)
block->dominators->SetBit(block->id)
BitVector T(nBlocks)
do {
changed = false
for each block in block_list {
if (block == entry_block)
continue
for each pred in block->predecessors {
T.SetAllBitsTo(0)
T.Or(block->dominators)
block->dominators->And(pred->dominators)
block->dominators->SetBit(block->id)
if (block->dominators != T)
changed = true
}
}
} while (changed);
}
Based on the dominators information we can
compute the set of basic blocks that belong
to each loop using the following two functions:
NaturalLoopForEdge(Block header, Block tail)
{
Stack workList;
Loop loop;
loop = new Loop();
loop->header = header
loop->blocks += header
if header != tail {
loop->blocks += tail
workList += tail
}
while workList not empty {
block = workList.pop
for each pred in block->predecessors {
if not loop->find(pred) {
loop->blocks += pred
workList += pred
}
}
}
return loop
}
ComputeNaturalLoops()
{
LoopSet loopSet;
for each block in block_list {
if (block == entry_block)
continue
for each succ in block->successors {
// Every successor that dominates its predecessor
// must be the header of a loop.
// That is, block -> succ is a back edge.
if block->dominators->find(succ)
loopSet += NaturalLoopForEdge(succ, block);
}
}
}
The algorithm work like this: first we find the back
edges, that is places where execution goes back to a
previously visited block. This is done by checking if
the successor of a block dominates the block itself.
The successor block will be the header of the loop.
Then, since to reach the (tail) block execution must go
through its dominator, we can traverse the
predecessors of each block starting from the tail,
until we reach the header block (which is always in the
loop since we put it there at the beginning of
NaturalLoopForEdge() ).
|