RecStudio Decompiler Design - Frame-less Functions

Dealing with framed functions makes it easy for a decompiler to identify local variables and parameters. Unfortunately, optimizing compiler don't generate a frame pointer for most functions. After all, the main reason for using a frame pointer is to help a debugger to traverse the list of active functions to tell the user how the current function was reached, and how to easily locate local variables and parameters (as well as registers that are saved across function calls).

The reason that a compiler can forget about the frame pointer is that the compiler knows exactly how to access each local variable or parameter at every point in the function. So, let's see the difference in code generated between framed functions and frame-less functions:

PUSH FP                      SP -= #local_size        {
FP = SP
SP -= #local_size                                        int var1;
...                          ...
// access local var1
R1 = FP[-8]                  R1 = SP[#local_size-8]
PUSH R1                      PUSH R1
// access local var1 again
R2 = FP[-8]                  R2 = SP[#local_size-8+4]
PUSH R2                      PUSH R2
func()                       func()                     func(var1,var1);
// deallocate parameters
SP += 8                      SP += 8
...                          ...
SP = FP                      SP += #local_size        }
POP FP
RET                          RET

Detecting the size of the locals is fairly easy even for frame-less functions. Just look for the allocation of the local variables area by the "SP -= #local_size" instruction in the prologue of the function, and a corresponding de-allocation of the local variables in the epilogue of the function.

But detecting local variables is more difficult. When using a frame pointer, the expression used to access a local variable or parameter is always the same, regardless of how many parameters are being passed to a function being called, because the frame pointer register is not affected by the PUSH instructions used to pass the parameters.

This is not true for frame-less functions: the expression used to access the same local depends on how the stack pointer has changed since the start of the function. This is because the stack pointer is used both to access the variables and also to pass parameters to other functions. Therefore, for frame-less functions, a decompiler must compute a stack pointer adjustment factor for each instruction, to be added to the initial location of each local variable.

Note that this has to happen even when computing the location of the variables themselves!

So we cannot use the same algorithm we used to create local variables and parameters for framed functions in the presence of frame-less functions. For example:

Instruction stream      SP adjustment   variable      var. location
SP -= #local_size            0                        SP's value = &CFA
R1 = SP[#local_size-8]       0          var1         CFA - 8
PUSH R1
R1 = SP[#local_size-4]       4          var1         CFA - 8
PUSH R1
R1 = SP[#local_size-4]       8          var2         CFA - 12
...
SP += 8
R1 = SP[#local_size-8]       0          var1         CFA - 8

We use here the concept of CFA, as defined in the DWARF debug format, to mean the value of the stack pointer at the entry point of the function. This is the value we use as reference throughout the function, and the value we use to identify variables locations. So we store var1 as having location "CFA - 8", and var2 as having location "CFA - 12". With this approach, we can easily determine whether a variable is a local variable or an incoming parameter, since the value of the offset to CFA is negative for local variables, and positive for parameters, just as we had for the framed function case.

Converting locations in the instruction stream is easily performed with the following algorithm:

   sp_adjust = 0
   for each instruction i in block
       if i.opcode == Push
          sp_adjust += i.operand_size
       if i.opcode == Pop
          sp_adjust -= i.operand_size
       if isSubFromSP(i)
          if block == entry_block
              local_size = i.op2.value
          else
              sp_adjust -= i.op2.value
       if isAddToSP(i)
          sp_adjust += i.op2.value

       i.result = replaceSPreferences(i.result)
       i.op1 = replaceSPreferences(i.op1)
       i.op2 = replaceSPreferences(i.op2)
   end for

Node replaceSPreferences(Node n)
{
   if n == null
      return n
   n.op1 = replaceSPreferences(n.op1)
   n.op2 = replaceSPreferences(n.op2)
   if n.isAddToSP()
      n = new AddNode(CFA, local_size - sp_adjust)
   if n.isSubToSP()
      n = new SubNode(CFA, local_size - sp_adjust)
   return n
}

This algorithm is not perfect, as it makes a few assumptions:

it assumes that the value of SP at the entry of each basic block is the same, that is that all bytes allocated inside a basic block are deallocated before exiting the basic block. This may not always be true, so the adjustment values computed for a basic block should be propagated to the successors of the block.
it assumes that a function call does not change the value of the stack pointer after the function has returned. This is not always true, especially in x86, where some calling conventions allow the called function to de-allocate its own parameters. In order to handle this case, we need to know the calling convention for the called function and change the sp adjustment value accordingly.
it assumes that all computations occur on the SP register. If the SP register is copied to another register, say R1, then R1 is modified and then copied back to SP, we would have failed to track the new value of the SP. This may happen in the case of the original function using the "alloca(size)" pseudo-function.

Due to this limitations, it is not always possible to reconstruct local variables and parameters for frame-less functions. Fortunately, this kind of function is not common (except maybe in gcc, which makes plenty of use of "alloca(size)" calls), so it's possible to mark these functions and ask the user to provide additional information interactively.

Next: Saved Register