REC Studio 4 - Examples



     

 


Last update: March 1, 2011

     

No decompiler will produce perfect, recompilable code except in the most fortunate cases, for example when full symbolic information is available in the executable file. Therefore decompilation is an iterative process that continues until the user is satisfied with the result produced by the decompiler.

The end goal need not be to recompile the decompiled output; it may simply be to understand how one portion of a file works, or to check whether the application contains malicious code (for security assessment purposes).

Rec Studio uses a number of algorithms that may produce output of varying levels of accuracy. Here is an example of the main loop of a Windows Hello World application:

L00401000(_unknown_ r7, struct HINSTANCE__* _a4)
{
    intOrPtr _v16;
    char _v68;
    intOrPtr _v76;
    struct tagMSG _v100;
    _unknown_ r1;
    _unknown_ r4;
    _unknown_ r5;
    _unknown_ r6;
    _unknown_ _t10;
    _unknown_ _t11;
    _unknown_ _t13;
    struct HACCEL__* _t14;
    int _t15;
    int _t18;
    int _t20;
    _unknown_ _t21;
    _unknown_ _t22;
    _unknown_ _t23;
    _unknown_ _t24;
    _unknown_ _t29;
    _unknown_ _t30;
    _unknown_ _t31;
    struct HINSTANCE__* _t32;
    struct HACCEL__* _t33;
    _unknown_ _t34;
    _unknown_ _t35;

    r7 = r7;
    _t32 = _a4;
    _push(_t29);
    LoadStringA(_t32, 103,  &M004054F4, 100);
    LoadStringA(_t32, 109,  &M00405490, 100);
    _push(_t32);
    L004010C0(r7);
    _t13 = L00401150(r7, _t32, _v16);
    r7 = r7 + 4;
    if(_t13 != 0) {
        _t14 = LoadAcceleratorsA(_t32, 109);
        _push(0);
        _push(0);
        _push(0);
        _push( &_v68);
        _t33 = _t14;
        _t15 = GetMessageA();
        if(_t15 != 0) {
            _push(_t23);
            _push(_t34);
            do {
                _t18 = TranslateAcceleratorA(_v100.time, _t33,  &(_v100.time));
                if(_t18 == 0) {
                    TranslateMessage( &(_v100.message));
                    DispatchMessageA( &(_v100.hwnd));
                }
                _t20 = GetMessageA( &(_v100.message), 0, 0, 0);
            } while(_t20 != 0);
            _pop(r6);
            _pop(r1);
        }
        _pop(r4);
        return _v76;
    } else {
        _pop(r4);
        return _t13;
    }
}

This program did not have any symbolic information available, since it was compiled in Release mode. Nevertheless Rec Studio was able to identify the main loop and the arguments passed to most functions.

The output includes a number of local variables that were clearly not present in the original source code. These variables are "temporaries" that could not be safely eliminated when generating the final code; they are a relic of the partial Static Single Assignment (SSA) algorithm used to "uncolor" registers. The larger and more heavily optimized the procedure being decompiled, the more likely it is that such temporaries will be generated (sometimes many dozens of them). In other cases, the decompiler is able to eliminate most if not all of these variables.

The decompiler can assign types to variables when they are passed as parameters to library functions, as with TranslateMessage() in the code above. Type information is propagated to other variables as far as possible, although inter-procedural type propagation is not implemented yet.

You will also notice that sometimes the decompiler is not able to assign the parameters to a function call, such as in this sequence:

        _push(0);
        _push(0);
        _push(0);
        _push( &_v68);
        _t33 = _t14;
        _t15 = GetMessageA();

This is because the "_t33 = _t14" assignment is in between the call and the push of the parameters, and the decompiler cannot yet determine if the assignment has any side effect on the arguments of the push instructions (e.g. if it modifies a location that was pushed previously). In such cases the decompiler will keep the less correct but more obvious code so that the user is aware of what is going on.

In some cases the decompiler cannot safely combine statements into high-level constructs such as for() and while(), and instead keeps the control flow in its simplest form, if-gotos, as in the following excerpt:

L3:
    _t6 =  *((signed char*)(_t9 + 1));
    _t9 = _t9 + 1;
    if(_t6 == 34 || _t6 == 0) {
        goto L7;
    }
L5:
    _t7 = _t6 & 255;
    _push(_t7);
    L004020E8(r7);
    _t14 = _t7;
    _pop(r2);
    if(_t14 != 0) {
        _t9 = _t9 + 1;
    }
    goto L3;
L7:
    if( *_t9 != 34) {
        goto L11;
    } else {
        goto L8;
    }
L8:

In this case, L5 is the body of a while() loop, but because of the two assignments at the beginning of L3, the decompiler could not collapse the if() in the L3 block into a while() statement. The quality of the generated control flow varies greatly with the complexity and optimization level of the compiled code. Still, Rec Studio is often able to reconstruct complex control flow sequences, including switch()/case statements.


Copyright © 1997 - 2011 Backer Street Software - All rights reserved.