Update from the GSoC 2

As part of GSoC, sushant94 and I (dkreuter) have been working for the last three weeks on what should become the basis for a decompiler integrated with the radare2 reversing framework.

For now it's a standalone program written in Rust that can read ESIL, the radare2 code format. The rough process involves generating control and data flow graphs in SSA form for the input, applying simplifications to them (much like a compiler does), and picking appropriate constructs in a target language to represent the input. The result will be a more intuitive representation of the analyzed program.

The task is hard even in theory, as a program that prints 4 could've been compiled from print(4) or print(2+2). There's no way to know. Right now, however, we're just trying to get the first and simplest case (4→4) to work. But the insight that decompilation is necessarily a process of interpretation is what I try to keep in mind in my designs.

The alternatives to Rust we considered were OCaml and C++. Neither of us had written Rust or OCaml before, but it seemed like many C++ skills would be transferable to Rust (at some cost in idiomatic style). The first two weeks were full of gotchas, but now I'm very comfortable with the language. The Rust IRC channel was very helpful in that regard.
Rust has plenty of cool features, including a very expressive type system (X<U> extends Y<Z> + Q where Z extends X<U>, etc.), a checker that ensures there are no double frees, aliased non-const pointers or dangling pointers (without a garbage collector), tagged unions, a nice build system unlike C++ (apparently the C++ modules proposal didn't make it into C++17), type checks for metaprogramming (C++ uses duck typing instead), nicely integrated documentation generation and testing, and pattern matching.
In C++ terms, all references in Rust are const restrict * const and have move semantics by default. This does make some tasks more tedious than usual (e.g. swap(&x[4], &x[5]) needs workarounds to compile), but I still think that these defaults will prevent more problems than they cause.
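To make the swap example concrete, here is a minimal sketch of two common workarounds using only the standard library. The helper name swap_via_split is made up for illustration and is not part of our code base.

```rust
/// Hypothetical helper: swap v[i] and v[j] (requires i < j) by splitting the
/// slice into two non-overlapping halves, so that each mutable reference
/// comes from a different borrow.
fn swap_via_split(v: &mut [i32], i: usize, j: usize) {
    assert!(i < j && j < v.len());
    let (left, right) = v.split_at_mut(j);
    std::mem::swap(&mut left[i], &mut right[0]);
}

fn main() {
    let mut x = [0, 1, 2, 3, 4, 5, 6];

    // std::mem::swap(&mut x[4], &mut x[5]);
    // ^ rejected by the borrow checker: `x` would be mutably borrowed twice.

    // Workaround 1: let the slice perform the swap internally.
    x.swap(4, 5);
    assert_eq!(x, [0, 1, 2, 3, 5, 4, 6]);

    // Workaround 2: split the slice first, then swap across the halves.
    swap_via_split(&mut x, 4, 5);
    assert_eq!(x, [0, 1, 2, 3, 4, 5, 6]);
}
```

For plain slices the built-in swap method is the obvious choice; the split variant is the pattern that generalizes to cases where two mutable references into the same container are really needed.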

So while sushant94 has been working on parsing and representing the data coming from radare2 (with good results it seems), I've been working on the graph data structures, which turned out to be more complicated than anticipated.
Firstly, an SSA graph doesn't only have nodes and edges. It also has one level of nesting of nodes (computations inside basic blocks) and an "edge order" (a node representing subtraction needs to know which edge represents its first operand and which its second), meaning that we couldn't just use a preexisting library without adaptations. Also, the fact that Rust wants a statically determinable tree of ownership to exist clashes a bit with the requirements of a graph.
In the end we used an existing graph library for the upper level (the basic blocks) and manually manage the lower one with instruction lists in each block. Integers are used as "pointers" between nodes on both levels. Another challenge was the phi nodes, which (unlike other instructions) have a variable number of operands: they have as many operands as their containing basic block has incoming control flow edges. This leads to a lot of special casing, making the code messy. I hope to find time to revisit this later (probably after GSoC).
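A rough sketch of the idea follows. It is not the actual Radeco code: all type and function names are invented for illustration, and for brevity it stores the basic blocks in a plain Vec instead of using a graph library for the upper level. What it does show is integers used as "pointers", ordered operands, and the phi invariant described above.

```rust
/// Index of a basic block in `Ssa::blocks` (an integer used as a "pointer").
type BlockId = usize;

/// Reference to an instruction: the block it lives in plus its position there.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ValueRef {
    block: BlockId,
    index: usize,
}

#[derive(Debug)]
enum Instr {
    Const(u64),
    /// Operand order matters: Sub(a, b) means a - b.
    Sub(ValueRef, ValueRef),
    /// One operand per incoming control flow edge of the containing block.
    Phi(Vec<ValueRef>),
}

#[derive(Debug, Default)]
struct Block {
    preds: Vec<BlockId>,
    instrs: Vec<Instr>,
}

#[derive(Debug, Default)]
struct Ssa {
    blocks: Vec<Block>,
}

impl Ssa {
    fn add_block(&mut self, preds: Vec<BlockId>) -> BlockId {
        self.blocks.push(Block { preds, instrs: Vec::new() });
        self.blocks.len() - 1
    }

    fn push(&mut self, block: BlockId, instr: Instr) -> ValueRef {
        let instrs = &mut self.blocks[block].instrs;
        instrs.push(instr);
        ValueRef { block, index: instrs.len() - 1 }
    }

    /// Every phi must have exactly as many operands as its block has predecessors.
    fn phis_well_formed(&self) -> bool {
        self.blocks.iter().all(|b| {
            b.instrs.iter().all(|i| match i {
                Instr::Phi(ops) => ops.len() == b.preds.len(),
                _ => true,
            })
        })
    }
}

/// Build a diamond: entry -> {left, right} -> join, with a phi in `join`.
fn build_diamond() -> Ssa {
    let mut ssa = Ssa::default();
    let entry = ssa.add_block(vec![]);
    let left = ssa.add_block(vec![entry]);
    let right = ssa.add_block(vec![entry]);
    let join = ssa.add_block(vec![left, right]);
    let one = ssa.push(entry, Instr::Const(1));
    let a = ssa.push(left, Instr::Const(4));
    let b = ssa.push(right, Instr::Const(2));
    let merged = ssa.push(join, Instr::Phi(vec![a, b]));
    ssa.push(join, Instr::Sub(merged, one));
    ssa
}

fn main() {
    let ssa = build_diamond();
    assert!(ssa.phis_well_formed());
    println!("{:#?}", ssa);
}
```

Because nodes refer to each other by index rather than by reference, the whole graph has a single owner, which keeps the borrow checker happy at the cost of having to keep indices valid by hand.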

Once that's done we'll work on some more integration with radare2 and begin the first code that interacts with the graphs, like simplification (2+x+2 → 4+x) and dead code elimination (if(0){/*delete this*/}).
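As a taste of what such a simplification could look like, here is a minimal sketch of the 2+x+2 → 4+x case on a toy expression tree. The Expr type and both functions are assumptions made up for this example; the real pass will operate on the SSA graph instead, and dead code elimination is not covered here.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Const(u64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Walk a chain of additions, summing the constants into `total` and
/// collecting every other term in `terms`. This relies on addition
/// (wrapping, as on a machine register) being associative and commutative.
fn collect(e: &Expr, total: &mut u64, terms: &mut Vec<Expr>) {
    match e {
        Expr::Const(c) => *total = total.wrapping_add(*c),
        Expr::Add(a, b) => {
            collect(a, total, terms);
            collect(b, total, terms);
        }
        other => terms.push(other.clone()),
    }
}

/// Rebuild the sum with a single folded constant in front.
fn simplify(e: &Expr) -> Expr {
    let mut total = 0u64;
    let mut terms = Vec::new();
    collect(e, &mut total, &mut terms);

    let mut rest = terms.into_iter();
    let first = if total != 0 {
        Expr::Const(total)
    } else {
        // Drop a zero constant unless the whole expression is just 0.
        rest.next().unwrap_or(Expr::Const(0))
    };
    rest.fold(first, |acc, t| Expr::Add(Box::new(acc), Box::new(t)))
}

fn main() {
    let x = || Box::new(Expr::Var("x".to_string()));
    // (2 + x) + 2
    let e = Expr::Add(
        Box::new(Expr::Add(Box::new(Expr::Const(2)), x())),
        Box::new(Expr::Const(2)),
    );
    // 4 + x
    assert_eq!(simplify(&e), Expr::Add(Box::new(Expr::Const(4)), x()));
}
```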

Update From the GSoC

As you know, we have 2 students working on r2 for the Google Summer of Code!

As we're 3 weeks into the summer, here's what one of our students, sushant94, has to say about what he's been working on!

GSoC logo

We're three weeks into GSoC and I'm having an amazing time. I am working alongside dkreuter and have been learning tons from him too!

Here is the repository where you can track our progress and also give us suggestions :)

We chose Rust as our implementation language. Though at first I was a bit scared of this choice, I quickly realized how great the language is! Rust has allowed me to be far more productive, after my initial battles with the borrow checker, of course.

The zero-cost abstractions allowed by Rust have been great so far!

This is a quick roundup of what I've been up to:

  • Firstly, I got an ESIL parser up and running. We use this parser to convert to an IR which is easier to perform static analysis on. While the ESIL is amazing for emulation purposes, it's not so much for static analysis as it has a very large number of supported opcodes. The RadecoIR (name subject to change) is much more simplified in terms of the number of opcodes. Also, ESIL is primarily just strings (as suggested by its name "Evaluable Strings Intermediate Language") which could be pretty hard to work with.
  • The next step was to build a Control Flow Graph (CFG) out of the IR. Building a control flow graph allows us to reason about the way control flows in the program, and hence helps us better understand the different programming constructs that go into making it. To make debugging and visualization easier, I first went ahead and implemented a dot format emitter (ok, I admit it, I probably did this first as I felt it was more fun :P). Just a brief word about dot: it is a graph description language which can be used as an input to graphviz to obtain visual representations of a graph. The current implementation is just a minimal dot format emitter and not all features of the dot format are supported. At this point, it still cannot be used on arbitrary generic graphs. I plan to expand this in the future to allow more features and to work on generic graphs.
  • The last part was making the CFG out of the emitted RadecoIR. This turned out to be pretty simple, however, there are still a ton of improvements to be made here :)
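To give an idea of what a minimal dot emitter involves, here is a small sketch. It is not the Radeco implementation: the Cfg struct and the to_dot function are made up for this example and only cover labelled blocks and plain edges.

```rust
/// Toy CFG: one label per basic block, plus edges as (from, to) index pairs.
/// This layout is an assumption for the example, not Radeco's data structure.
struct Cfg {
    labels: Vec<String>,
    edges: Vec<(usize, usize)>,
}

/// Emit the graph in the dot language so that graphviz can render it.
fn to_dot(cfg: &Cfg) -> String {
    let mut out = String::from("digraph cfg {\n");
    for (i, label) in cfg.labels.iter().enumerate() {
        // Escape backslashes and double quotes inside the quoted label.
        let escaped = label.replace('\\', "\\\\").replace('"', "\\\"");
        out.push_str(&format!("    n{} [shape=box, label=\"{}\"];\n", i, escaped));
    }
    for &(from, to) in &cfg.edges {
        out.push_str(&format!("    n{} -> n{};\n", from, to));
    }
    out.push_str("}\n");
    out
}

fn main() {
    let cfg = Cfg {
        labels: vec!["entry".into(), "then".into(), "else".into(), "exit".into()],
        edges: vec![(0, 1), (0, 2), (1, 3), (2, 3)],
    };
    print!("{}", to_dot(&cfg));
}
```

Saving the output to a file such as cfg.dot and running dot -Tpng cfg.dot -o cfg.png renders it as an image.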

Apart from Radeco itself, I also helped in making r2pipe.rs, which allows communication with radare2 over pipes. In the future, r2pipe.rs will be used to connect Radeco to radare2. r2pipe.rs is great news if you're a Rust person, as you can now interact with radare2 and extend it to meet your needs!

Check out the documentation if you're interested :)

This is just a glimpse of what's in for the upcoming weeks:

Bonus:

Here is a small example of the CFG that we're currently able to generate

cfg

Radare 0.9.9

Today, we're releasing a new version of radare2, the 0.9.9, codename Almost There. Since you might be a bit too lazy to read every single commit, we're going to highlight some cool new features!

Numbers

Thanks to more than 50 contributors who issued something like 1700 commits, here is what changed:

$ git checkout 0.9.9 && git diff 0.9.8 --shortstat  
 839 files changed, 156490 insertions(+), 18885 deletions(-)

{pancake} I would like to give special thanks to all the new contributors who made this release possible. You can find a complete list of them in the AUTHORS file. I am still the main developer, architect and maintainer of the project, but thanks to its increased popularity I'm starting to delegate some tasks and handle the development from a better perspective: teaching newcomers, prioritizing features, enforcing the testsuite and much more.

Console

As you know, the currently recommended way to use radare2 is its CLI. This is why we're doing our best to polish it. Our Windows users will be delighted to know that radare2 now works great on their platform, and has almost reached feature parity with real operating systems, with ground-breaking features like arrow-key support or ^C to issue SIGINT.

Most of the w32 enhancements were done by Skuater, who fixed and tested some bugs in the windbg plugin, implemented support for FPU and MMX registers on Windows and Linux-x86-32/64, and enhanced the console input to work almost as well as on Linux.

One of our core-contributors happens to be a truecolor fanatic, so now radare2 can support all 16,777,216 colors!

Among various console improvements, you'll find the new variables scr.wheelspeed and scr.responsive to improve navigation.

We know that the learning curve for radare2 is super-steep, and we're sorry about this. The good news is that we checked that documentation was available for every single command, and wrote it where it was missing! You can as usual append ? to your commands to get documentation about them.

New architectures

i4004

Let's go back in time to 1971. At that time, Intel released the first general-purpose programmable microprocessor on the market, the i4004.

i4004 CPU

It was a blazing-fast CPU, 740 kHz, able to directly address 640 bytes of RAM! So now, 44 years later, radare2 supports this CPU.

LH5801

While we're back in time, did you know that we're supporting the good old LH5801?

pocket computer

It's an 8-bit CPU that was used in the first pocket computer!

z80

The previous z80 disassembler was under GPL, had comments in German (Like LibreOffice and systemd!), was huge and a pain to maintain. Thanks to condret, we now have a clean, correct, LGPL-licenced z80 disassembler which is 75% smaller!

Pebble

If you have a pebble watch, you can now disassemble applications with radare2!

Analysis improvement

We added two alternative analysis loops with several levers to tune options such as skipping nops at the beginning of a function, detecting functions by following calls, and handling local variables. ELF PLT and Thumb detection are now supported for ARM and ARM64;
local-flags/function-labels are also back for every supported architecture.

PE relocations are now displayed in a sexy way:
PE RELOC PE RELOC 2

There is (basic) support for CRIS analysis
Cris anal

Also, can you spot Dalvik-related enhancements?

Commands changes

We changed some commands (for good), but since they were cryptic, you probably never used them before, so you won't even notice the changes. If you do, we would appreciate your feedback.

We also added some new ones, mostly subcommands of p. Can you find them? ;)

Fixing bugs

Thanks to the amazing work of maijin, we now have our (ever-growing) testsuite running on travis to avoid regressions!

Also, jvoisin had fun fixing 75% of our coverity issues, bringing the current total to less than 150!

We also fixed bugs found by shellcheck, cppcheck, valgrind, and more!

ESIL

Remember ESIL? Our IL. Come on.

Anyway, condret has been working hard on it, mainly on the specs and Game Boy support. Nighterman has added features for x86 emulation, pancake for arm and mips and dkreuter for i8051... Emulation, here we come!

Also, congrats to sushant94 for his implementation of an ESIL to REIL translator and dkreuter for his ESIL implementation for 8051! Not bad for a first contribution, heh?

Search

We already wrote about it, but crowell added regular-expressions support to the rop-gadget finder. Also please note that the separator is now ;, and that you must quote the whole command when you use it.

Some people are using radare2 instead of binwalk to run libmagic on unknown files. This is why we optimized the speed and efficiency of the /m command a bit.

ASCII graphs

Remember when we bragged about the awesome ASCII-graph support in radare2? Well, today, thanks to r0nk, we'll brag again:

Graphs now have awesome colours by default:
Graph colours

Of course, colours are supported in the minigraph too!

Coloured minigraph

We've got two display modes for graph edges. You can switch between them with the e key.

Graph edge styles

Teaching

Radare2 is not only used to reverse exotic binaries or craft ingenious exploits: it's also used to teach computer science!

Radare2 comes with almost 250 fortunes, and while we think that they are super-fun, some might actually be offensive, or ill-suited for formal presentations. This is why we split them: you can now set cfg.fortunetype to tips, fun, nsfw, or any combination of them. We hope that this will help you to avoid awkward situations while doing a presentation ;)

Since not everyone is fluent in weird instruction sets, radare2 comes with an asm.pseudo option to show instructions in a more obvious way.

You can also try our proto-alpha-preprod decompiler with pdc:

decompiler

Debugger

WinDBG

TheLemonMan added support for WinDBG, the ring-0 debugger of Windows. This means that you can not only debug drivers with radare2, but also virtual machines. Imagine, breaking, modifying and stepping Windows, with radare2!

WinDBG support

Tracing

Thanks to earada, tracing is now working much better and can be displayed in the ascii-art and web graphs.

tracing

Web interface

We already said a lot about our new web interface, by pwntester, but I'm quite sure that you can't have enough of it:

  • a minimap
  • massive speedup
  • interactive graphs
  • even more contextual menu
  • hexdump
  • projects support
  • type-edition
  • variables renaming
  • debugger support
  • tracing

tracegraph

r2pipe

Since radare2 is a fast-moving target, instead of using traditional-bindings, the recommended way to call radare2 from a foreign language is to use r2pipe, which is roughly an API to communicate with an instance of radare2 using HTTP, PIPEs, TCP sockets or STDIO to run r2 commands and get the output in a string.

We'd like to take this opportunity to remind you that you just have to append j to a command to get its output in JSON. If you're parsing raw radare2 output by yourself, you're going to have a bad time.

Currently, we have stable and mature support for Python (2+3), Go and NodeJS, but we also support D, C#, Java, Ruby, Perl, Vala, NewLisp, Shellscript, Rust...

Packages for r2pipe are available from the python pip, ruby gem, and node.js npm package managers.

r2pipe offers a simple interface for running r2 commands over a pipe, TCP or HTTP connection and getting the output as a plain string or a JSON object. It has also been integrated with rlang, so you can run those scripts from the shell as if they were plain r2 scripts:

r2 -i stuff.py /bin/ls
[0x8040580]> . stuff.py

Misc

Most of the radare developers are using vim, but some of us prefer emacs. This is why there are now vim and emacs keybindings in visual mode!

Thanks to Aaron Puchert, we've got a new assembler for x86, cleaner, and more efficient! But we are still using the x86.nz assembler plugin which supports more instructions, and bear in mind that you can also use the x86.olly or x86.as.

Ok, what's next?

It depends on what you're going to implement, of course! Now that we have a fantastic testsuite, it's way easier to contribute. The next release should focus on improving the debugging capabilities, but since we're an open-source project, it depends on what our contributors are up to.

Misc

Build

Thanks to pancake, build times have been reduced (especially on Windows, where you can expect a 30% reduction!). This is why it only takes 3m37s to build radare2 from git on the UbuntuPhone.

The Android application has been updated. You can now build radare2:

  • on iOS 8.3 and its simulator
  • on latest OSX statically
  • without GPL plugins

TV

Did you know that radare2 was shown on a national (Spanish) TV channel, thanks to Gabriel Gonzalez from IOActive?

By the way, if you're using radare2 at work, we'll be delighted if you let us know about it ;)

Bonus screenshots

Edge style comparison Comparison of the two edge styles.

R2 3D logo Awesome stereograms!

Mazda Of course radare2 runs on your Mazda!
Yes, it's a car :)