RecStudio Decompiler Design

Functions that use a frame pointer are easily recognizable, since they set up the frame in their prologue code (the code generated for the opening "{") and they remove the frame in their epilogue code (the code generated for the closing "}"). Different processors may even have specialized instructions to set up and remove the frame:

x86 frame-oriented instructions	68K frame-oriented instructions
ENTER ## LEAVE	LINK A6, ## UNLK A6	Set up the frame (function's "{") Remove the frame (function's "}")

However, it's perfectly legal for the compiler to use simpler instructions to set up the frame, and that in fact is what happens with modern compilers, which take into account the speed of execution of different instructions (for some reason Intel decided to not optimize the enter and leave instructions, preferring to optimize individual, simpler instructions). Therefore, a decompiler should be able to recognize any possible code sequence that sets up and removes the frame. Actually the above specialized instructions can be broken down into instructions that have the same behavior, as in the following table:

x86 frame-oriented sequences
PUSH EBP EBP = ESP ESP = ESP - ## ESP = EBP POP EBP	Set up the frame (function's "{") ## = size of locals Remove the frame (function's "}")

The translation of the specialized instructions into equivalent instruction sequences can be left to the instruction stream analyzer, so that the other parts of the decompiler won't ever have to see the specialized instructions.

Removing these instructions is important for a few motives:

these instructions cannot be directly generated by user code. They are implicitly generated by the function's { and }.
setting up the frame usually tells how much space is reserved for local variables, so it's useful to detect this information for the next phase (local variable replacement)
reducing the number of instructions, especially the instructions that remove the stack, open the door to further reductions, such as replacing a "goto" to the last block with a "return" statement.

The last case is exemplified in the following sequence of transformations:

   PUSH FP         PUSH(FP)          {                   {
   FP = SP         FP = SP
   ...             ...                  ...                 ...
   CMP R1,#0       if(R1 == 0)          if(R1 == 0)         if(R1 == 0)
   JEQ L2            goto L2              goto L2              return
   ...             ...                  ...                 ...
   CMP R2,#0       if(R2 > 0)           if(R2 > 0)          if(R2 > 0)
   JGT L2            goto L2              goto L2              return
   ...             ...                  ...                 ...
L2:            L2:                   L2:
   SP = FP         SP = FP
   POP FP          POP(FP)
   RET             return               return
                                     }                   }

with the removal of the L2 label and of 2 goto statements.

Since the stack pointer and the frame pointer are generic concepts, used by many compilers and processors, it makes sense to implement the removal of the prologue and epilogue in the processor-independent part of the decompiler. Code such as the following could be easily implemented:

    proc.isFramed = false // by default, not framed

    FPreg = processor.getFramePtrReg
    SPreg = processor.getStackPtrReg
    top = entry_block.firstInstruction
    bottom = exit_block.lastInstruction
    if bottom.op == Return
        bottom = bottom.previousInstruction
    topNode = top.expr
    bottomNode = bottom.expr
    if !topNode.isPush(FPreg) || !bottomNode.isPop(FPreg)
        return             // not a framed function

    top = top.nextInstruction
    entry_block.Remove(top.previousInstruction) // remove PUSH FP
    bottom = bottom.previousInstruction
    exit_block.Remove(bottom.nextInstruction)    // remove POP FP

    topNode = top.expr
    bottomNode = bottom.expr
    if !topNode.isAssignment(FPreg, SPreg) ||
       !bottomNode.isAssignment(SPreg, FPreg)
        return             // not a framed function

    top = top.nextInstruction
    entry_block.Remove(top.previousInstruction) // remove FP = SP
    exit_block.Remove(bottom)                    // remove SP = FP

    proc.isFramed = true
    proc.sizeOfLocals = 0
    topNode = top.expr
    if !topNode.isAssignment(SPreg) ||
       !topNode.op2.isSubtract(SPreg, ConstantValue)
        return             // framed, but no local variables
    proc.sizeOfLocals = -topNode.op2.value

Recognizing local variables

Having determined that a function uses the frame pointer automatically allows us to determine which variable is a local variable of the function and which is an incoming parameter (except for parameters passed in registers, which will be discussed later). Local variables are accessed at negative offsets from the frame pointer, while arguments are accessed at positive offsets:

R1 = *(FP + 4) // argument (e.g. "arg4")

R1 = *(FP - 12) // local variable (e.g. "local12")

By scanning the instruction sequence and replacing each access using the FP into a synthetic variable allows us to remove references to the frame pointer and "create" local variables or arguments:

    for each instr in proc
        replaceFramePtr(instr.expr)

replaceFramePtr(Node n)
{
    if n.op1 != null
        replaceFramePtr(n.op1)
    if n.op2 != null
        replaceFramePtr(n.op2)
    if n.isAdd(FPreg, ConstantValue)
        name = "arg" + n.op2.value
        var = proc.arguments.find(name)
        if var == null
            var = new Variable(name)
            proc.arguments.add(var)
        n = var           // replace (FP + #) with var
    else if n.isSub(FPreg, ConstantValue)
        name = "local" + n.op2.value
        var = proc.locals.find(name)
        if var == null
            var = new Variable(name)
            proc.locals.add(var)
        n = var           // replace (FP - #) with var
}

Note one wrinkle in the above code: memory references, such as (FP + 4) (which are addresses), are being converted to named variables (which by default are values)! This is incorrect, as an implicit indirection is added when accessing a named variable. Therefore we need to add an additional "AddressOf" operator to keep the semantic of the original instructions, in case the original instructions were computing the address of a local variable or argument. In other words:

     (FP + 4)    ==                   &arg4
    *(FP + 4)    == *(&arg4)      == arg4
     (FP - 12)   ==                   &local12
    *(FP - 12)   == *(&local12)   == local12

Thus the above n = var statements should be replaced with:

n = new Node(AddressOf)
n.op1 = var

The sequence *(&var) is replaced with var by a clean-up pass that simplifies expressions to make them better-looking when emitted to the decompiled output.

Other resources:
http://unixwiz.net/techtips/win32-callconv-asm.html

Next: Frame-less functions

Decompiler Design - Stack Frames

Stack Frames

Recognizing local variables