Reverse Engineering Resources - Decompilers

 


 

Decompilers

A decompiler tries to translate an object file into a compilable source file. There are many decompilers for C# or Java, but only a few for C/C++. See in particular:

  • Ghidra: An open-source decompiler developed by the U.S. National Security Agency. It is an advanced interactive environment for binary analysis and decompilation (it seems inspired by IDA, see below). It's written in Java and has a user interface resembling the Eclipse IDE (in fact there's also a plug-in for Eclipse). I've analyzed its implementation (the decompiler itself is in C++), and it has many of the features I wanted to implement in my own decompiler (see REC, below).
    Users can write their own plug-ins for target-specific analysis in either Java or Python. It runs on Windows, Linux and macOS, and supports many processors. New processors can be added by writing text files that specify the processor architecture and its instruction set.
    Overall, an excellent work, which sets a new standard for decompilers.

  • reko: Another open-source decompiler. It's written in C#, so it only runs on Windows or on platforms supporting Mono. It accepts binaries compiled for many processors. It has a GUI with all the standard views (disassembly, hex dump, C source, project), and it can also be used from the command line.

  • RetDec: Originally developed as an on-line service by the Brno University of Technology, Czech Republic, and AVG Technologies (now part of Avast), it can be downloaded from a GitHub repository and run locally.
    I have not evaluated it, but I read the paper published by the Brno University team, and it seemed to be at the level of the other advanced decompilers available at the time.

  • C4Decompiler: (The original link seems to be dead. I'm leaving the description here in case it becomes available again. I think I have an old version downloaded on my hard disk.)
    A new decompiler under development. It is Windows only, and has a slick user interface inspired by Visual Studio 2010, with many useful interactions that unfortunately are not always obvious: one has to right-click to discover them. The analysis seems very good, at least for the debug-compiled example included in the installation. Trying it on random executables from the Windows folder gave mixed results, ranging from a completed analysis to crashes and endless loops.
    Still, it's very promising, as its authors have clearly put a lot of thought and effort into its development.

  • Boomerang: An open-source C decompiler, with a very advanced set of analyses that attempt to solve the most difficult problems facing decompilers. The quality of the generated code varies greatly: some functions are almost perfect in their representation of code structure, local variables and types, while other functions look highly obfuscated because of the number of variables and the way they are used. It's also rather fragile, as it often crashes on big programs.

  • REC: My own C decompiler for Linux, DOS and Windows. It was the first decompiler to work on multiple platforms and to support multiple processors (x86 16-bit and 32-bit, MIPS, 680x0, PowerPC). It's very stable, as it's been tested with hundreds of programs. The quality of the output is not as good as Boomerang's, since its implementation is based on a 20-year-old coding style (read: very difficult to extend). I've now published a new version, RecStudio 4, which supports 64-bit executables. It has not been tested on as many executables, so problems still remain. Also, the different analyses it performs (SSA) generate totally different code, which at times may seem of much worse quality than the code generated by the previous version (although it's probably more correct).

  • Hex Rays: A decompiler plug-in for IDA Pro. The combination with IDA's advanced disassembly capabilities and run-time debugger makes it the ideal choice. However, it's still very new, and it requires IDA Pro. Unlike the other decompilers, it's not free. It also has yet to stand the test of time in terms of stability. Very promising.

  • Dcc: DOS to C decompiler. One of the first decompilers. It shows its age, but it's still referenced by many other decompilers for its structuring abilities. It only supports 8086 (16-bit) programs.

  • More on other decompilers at the Program Transformation Wiki on Decompilation

Here's a comparison of the various decompilers:

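| Decompiler   | Platform      | Targets Support              | Binary Format Support          | Interactive / Batch      | Recompilable Output | Structuring | Variables | Types     | Notes |
|--------------|---------------|------------------------------|--------------------------------|--------------------------|---------------------|-------------|-----------|-----------|-------|
| C4Decompiler | Windows       | IA64                         | PE-COFF                        | Interactive GUI          | No                  | Very good   | Good      | Fair      |       |
| Boomerang    | Windows/Linux | IA32, MIPS, PPC              | ELF, PE-COFF, Mac-OS           | Batch with GUI front-end | No                  | Very good   | Good      | Very good |       |
| REC          | Windows/Linux | IA32, IA64, MIPS, PPC, mc68k | ELF, PE-COFF, AOUT, RAW, PS-X  | Batch / Interactive      | No                  | Good        | Fair      | Partial   |       |
| dcc          | Windows       | 8086                         | DOS .com                       | Batch                    | No                  | Good        | Fair      | Poor      |       |
| Hex Rays     | Windows       | ?                            | ?                              | Interactive              | ?                   | ?           | ?         | ?         |       |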

Testing Decompilers

The quality of a decompiler depends on how good the code it generates is, and on how well it copes with "unexpected" input.

Particularly difficult problems are posed by compiler optimizations, which make the input code highly unstructured and difficult to understand, even for a human. How well a decompiler handles the following cases defines its quality:

No information on symbol names in the binary file (stripped executable)

Statically vs. dynamically linked executable files (use pattern matching vs. dynamic linker information to identify access to library functions)
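
When building a test set, it helps to know which of the two cases above a given binary falls into. The following is a minimal sketch for ELF files only, not taken from any of the decompilers listed here. The function names are hypothetical, and both checks are heuristics: a file is treated as stripped when it has no SHT_SYMTAB section, and as dynamically linked when it has a PT_INTERP program header (which names the dynamic linker). Shared libraries, static PIE executables and files with unusual headers would need more careful handling.

```python
import struct

PT_INTERP = 3    # program header type: path of the dynamic linker
SHT_SYMTAB = 2   # section header type: full (non-dynamic) symbol table


def _header_types(data):
    """Return (program header types, section header types) of an ELF image."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is64 = data[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if data[5] == 1 else ">"   # EI_DATA: 1 = little-endian
    # Fields after the 16-byte e_ident: e_type, e_machine, e_version, e_entry,
    # e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
    # e_shentsize, e_shnum, e_shstrndx.
    fmt = endian + ("HHIQQQIHHHHHH" if is64 else "HHIIIIIHHHHHH")
    f = struct.unpack_from(fmt, data, 16)
    phoff, shoff = f[4], f[5]
    phentsize, phnum, shentsize, shnum = f[8], f[9], f[10], f[11]
    word = endian + "I"
    # p_type is the first 32-bit word of every program header.
    ptypes = [struct.unpack_from(word, data, phoff + i * phentsize)[0]
              for i in range(phnum)]
    # sh_type is the second 32-bit word of every section header.
    stypes = [struct.unpack_from(word, data, shoff + i * shentsize + 4)[0]
              for i in range(shnum)]
    return ptypes, stypes


def is_dynamically_linked(data):
    """Heuristic: the executable asks for a dynamic linker (PT_INTERP)."""
    ptypes, _ = _header_types(data)
    return PT_INTERP in ptypes


def is_stripped(data):
    """Heuristic: no full symbol table section is present."""
    _, stypes = _header_types(data)
    return SHT_SYMTAB not in stypes
```

Both functions take the raw bytes of the file, for example `open(path, "rb").read()`, so a test harness can sort its sample executables into the stripped/unstripped and static/dynamic buckets before feeding them to a decompiler.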