REC Studio 4 - Reverse Engineering Compiler

 


 

 
Last update:
November 16, 2015

 


 
REC Studio is an interactive decompiler.

It reads a Windows, Linux, Mac OS X or raw executable file, and attempts to produce a C-like representation of the code and data used to build the executable file.
It has been designed to read files produced for many different targets, and it has been compiled on several host systems.
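
As a rough illustration of the first step any such tool has to take, the sketch below guesses the container format of an input from its leading magic bytes. It is not REC Studio's loader: the function name `guess_container` and the returned labels are hypothetical, and only the well-known ELF, PE and thin Mach-O signatures are checked. Anything else is treated as a raw binary.

```python
import struct

# Mach-O magic numbers as they appear in the first four bytes of a thin
# (single-architecture) file, in both byte orders.
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce": "Mach-O 32-bit (big-endian)",
    b"\xce\xfa\xed\xfe": "Mach-O 32-bit (little-endian)",
    b"\xfe\xed\xfa\xcf": "Mach-O 64-bit (big-endian)",
    b"\xcf\xfa\xed\xfe": "Mach-O 64-bit (little-endian)",
}


def guess_container(data: bytes) -> str:
    """Return a rough label for the container format of `data` (hypothetical helper)."""
    if data[:4] == b"\x7fELF":
        # e_ident[EI_CLASS] is the fifth byte: 1 = 32-bit, 2 = 64-bit.
        ei_class = data[4] if len(data) > 4 else 0
        return {1: "ELF 32-bit", 2: "ELF 64-bit"}.get(ei_class, "ELF (unknown class)")
    if data[:4] in MACHO_MAGICS:
        return MACHO_MAGICS[data[:4]]
    if data[:2] == b"MZ":
        # The DOS header stores the file offset of the PE signature at 0x3C.
        if len(data) >= 0x40:
            (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
            if data[pe_offset:pe_offset + 4] == b"PE\x00\x00":
                return "PE"
        return "DOS MZ executable"
    # No recognized signature: treat the input as a raw binary image.
    return "raw"
```

A real loader would go on to parse section tables, symbols and relocations. The point here is only that the format can usually be decided from a handful of bytes, with "raw" as the fallback when nothing matches.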

REC Studio 4 is a complete rewrite of the original REC decompiler. It uses more powerful analysis techniques such as partial Static Single Assignment (SSA), can load Mac OS X files, and supports both 32-bit and 64-bit binaries.
Although still under development, it has reached a stage that makes it more useful than the old Rec Studio 2.
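
To give an idea of what the SSA form buys an analysis, here is a minimal sketch of SSA-style renaming for a single straight-line block. It is an illustration only, not REC Studio's implementation: the function `rename_straight_line`, the tuple layout and the register names are assumptions, and the phi functions that full SSA needs at control-flow joins are deliberately left out.

```python
from collections import defaultdict


def rename_straight_line(block):
    """Give every assignment in a straight-line block a fresh version number.

    `block` is a list of (dest, operands) pairs; operands are variable names
    (str) or integer constants. Version 0 stands for the value on block entry.
    """
    version = defaultdict(int)  # latest version seen for each variable
    renamed = []
    for dest, operands in block:
        # Uses refer to the most recent definition of each variable.
        used = [f"{v}_{version[v]}" if isinstance(v, str) else v for v in operands]
        # Each definition creates a new, never-reassigned name.
        version[dest] += 1
        renamed.append((f"{dest}_{version[dest]}", used))
    return renamed


block = [
    ("eax", ["ebx", 4]),      # eax = ebx + 4
    ("eax", ["eax", "ecx"]),  # eax = eax + ecx
    ("ebx", ["eax"]),         # ebx = eax
]
result = rename_straight_line(block)
```

After renaming, each name has exactly one definition, so the question "which assignment does this use of eax refer to?" is answered by the name itself. That property is what makes data-flow passes such as constant propagation simpler to write.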

Rec Studio 2 pages are here.

 

Features

 

As mentioned, Rec Studio 4 is still under development. Most target independent features have been completed, such as:

  • Multihost: Rec Studio runs on Windows XP/Vista/7, Ubuntu Linux, Mac OS X.
  • Symbolic information support using Dwarf 2 and partial recognition of Microsoft's PDB format.
  • C++ is partially recognized: mangled names generated by gcc are demangled, and inheritance described in Dwarf 2 is honored. However, C++ is a very broad and difficult language, so some features, such as templates, are unlikely ever to be supported.
  • Types and function prototype definitions can be specified in text files. Some standard Posix and Windows APIs are already provided in the Rec Studio package.
  • Interactivity is supported, but is limited to the definition of sections, labels and function entry points. It still needs to be extended to support in-program definition of types and function parameters.
This table shows the target-specific features that have been implemented so far:

Feature                      x86 (ia32)   x86_64       Mips         PowerPC      mc68k        ARM
Disassembler                 Done         Done         Done         Done         Done         Planned
PE COFF loader               Done         Done         n/a          n/a          n/a          n/a
ELF loader                   Done         Done         Done         Done         Done         Planned
COFF loader                  Done         n/a          n/a          n/a          Done         n/a
Mac OS X loader              Done         Done         n/a          Planned      n/a          Planned
Dwarf2 symbolic information  Done         Done         Done         Done         n/a          Planned
COFF symbolic information    Planned      n/a          n/a          n/a          Planned      n/a
Calling conventions          In progress  In progress  In progress  Planned      Planned      Planned
32 and 64 bits               In progress  In progress  n/a          n/a          n/a          n/a
Floating-point               Planned      Planned      n/a          n/a          n/a          n/a
Windows Debugger             In progress  Planned      n/a          n/a          n/a          n/a
Gdb Debugger                 In progress  In progress  n/a          n/a          n/a          n/a

REC sources are not in the public domain.

Although REC can read Win32 executable (PE) files produced by Visual C++ or Visual Basic 5, there are limitations on the output produced. REC will try to use whatever information is present in the .EXE symbol table. If the .EXE file was compiled without debugging information, if a program database (.PDB) file or the CodeView (C7) format was used, or if compiler optimization was enabled, the quality of the output will be poor. Moreover, Visual Basic 5 executable files are a mix of subroutine code and form data, and it is almost impossible for REC to determine which is which. The only option is to use a .cmd file and manually specify which areas are code and which are data.

In practice, only C executable files produce meaningful decompiled output.


References

 

Several other decompilers are available from various sources. Look at my reverse engineering page for a list.

Rather surprisingly, the internal architecture of a decompiler is very similar to that of a compiler. High-quality literature exists for both. The Design Notes page has information on the problems that a decompiler writer faces when trying to decompile slightly more complex programs than simple unit tests.
The decompilation page has links and documentation related to decompilers in general.

Mike van Emmerik's PhD thesis significantly advanced the field of decompilation by outlining solutions for fundamental problems in the decompilation of binary programs.

Cristina Cifuentes' PhD thesis, Reverse Compilation Techniques, describes in detail the theory and implementation of the dcc decompiler for 8086 DOS programs.

The Wotsit page has links to the specifications of object file formats like COFF and ELF.

Some concepts related to code analysis are covered in the REference Debugger pages.

Other fundamental books I used during the development are:

  • "Compilers - Principles, Techniques and Tools", Aho, Sethi, Ullman, 1986 Addison-Wesley Publishing Co.  ISBN 0-201-10088-6.
  • "Advanced Compiler Design & Implementation", Steven Muchnick, 1997 Morgan Kaufmann Publishers, ISBN 1-55860-320-4.
  • "How Debuggers Work - Algorithms, Data Structures, and Architecture", Jonathan Rosenberg, 1996 John Wiley and Sons, ISBN 0-471-14966-7.

The disassemblers used in REC were taken from various sources. The file copyrite in the distribution lists the credits for each of them. I wrote the rest of the code myself over the last 25 years. I will continue to improve REC in my spare time, but I cannot guarantee that I will be able to fix bugs or add new features, processors, or hosts.

 

Disclaimer

 

There is a lot of discussion on the legality of decompilation. Decompiler tools have been available for a variety of platforms for a long time. Decompilers, along with other tools like debuggers, binary editors, disassemblers etc. should only be used when the owner of a program has the legal right to reverse engineer the program.

Courts in the US and other countries have established that it is legal to use decompilers under the fair use clause of copyright law.

To find out when it is legal to use a decompiler, you should read the text of the following cases:

Also read a discussion on the legality of using an emulator to run a binary program on a different host.

Backer Street Software does not support the use of reverse engineering tools for illegal purposes.


Copyright © 1997 - 2015 Backer Street Software - All rights reserved.

History:
 

9 March 2011  Version 4.0 Beta: Complete rewrite of the decompiler to support more modern architectures (MachO files, x86_64).
2 July 2007  Version 2.2: Fixed decompilation of raw binaries via .cmd files. Partially implemented register constant propagation. Fixed many 68k errors.
6 May 2007  Version 2.1: Added back +batch option to RecStudio; use Ndisasm for i386; better isolation of import data for Windows binaries
20 Sep. 2005  Version 2.0d: More bug fixes for 68k
6 Sep. 2005  Version 2.0c: Support for Linux .o files and improved support for 68k
15 Aug. 2005  Version 2.0b: Maintenance release. Support for Watcom-compiled binaries and wide strings
1 Aug. 2005  Version 2.0a: Maintenance release. Fixed crashes, improved quality with Windows executables
30 May 2005  Version 2.0: Windows GUI and interactive decompilation
19 Sep. 2000  Version 1.6: Added support for SPARC.
16 Mar. 1999  Version 1.5d: Restored detection of switch(). Added support for big-endian MIPS.
6 Mar. 1999  Version 1.5: Support for import/export info in Win95 files; replaced GNU disassemblers with freeware source; fixed many crashes
22 Nov. 1998  Version 1.4a: Fixed endless loop when decompiling Win95 files; added Windows prototype files
15 Nov. 1998  Version 1.4: Added browser capability in interactive mode, and HTML page generation
30 Jul. 1998  Version 1.3b: Maintenance: fixed crashes and various problems in 68k.
15 Feb. 1998  Version 1.3: Added Motorola 68000 and PowerPC targets.
7 Dec. 1997  Version 1.2: fixed PC's user interface. Can now load 16-bit DOS executables. More bug fixes.
26 Oct. 1997  Version 1.1: multi-target support (386 + R3000), loading of ELF and PE files, several bugs fixed.
6 Oct. 1997  Ported to Windows in console mode (recr4kpc.zip) and to SunOS (recr4ks4.tar.gz)
20 Sep. 1997  Created to make recr4kl.zip available.


CG's Home Page

Last updated: November 22, 2015