Almost one month since our last article; time flies. This article is an interview with a new contributor who greatly enhanced one of the most visually impressive features of radare2, the one that our propaganda department contributors love to show at conferences!
Hi, I'm ret2libc. I was an IDA addict and this is my 10th day without IDA.
Just joking, I still use IDA, but I'd really love to switch in the future, when r2 is good enough. I am a Computer Science student, and I like security-related stuff, RE, malware and... CTFs!
- Why contribute to radare2 ?
Just use IDA instead, like everyone else.
Well, there are just plenty of bugs to fix and it's really fun. r2 is not perfect (yet), but it's an opportunity to be involved in a very good project, full of anal nice people. Also, I was really annoyed by the fact that I had to switch to Windows or a Windows VM every time I just wanted to analyze a binary, even a simple one.
Instead r2 works almost everywhere, I just need my terminal. It's free, it's cool and it can be a good enough tool to work with binaries from some CTF.
So, let's try it, but... wait, there's only plain disassembly here, where is my graph?? Ok, 'VV' and, well, I've found what I wanted to improve ;)
- You have fixed a big pile of bugs regarding graphs, that's pretty impressive. Do you have a strong background in programming/maths ?
Let's say I have a background in programming/maths. As I mentioned, I'm a Computer Science student.
I think graphs are one of the fundamental features a disassembler should have. They let you understand at a glance what's going on in a program, and many times that's the only thing I need from IDA.
- What are you currently implementing ? Someone told me about colours in graphs.
I've just finished implementing an algorithm that gives a better layout for the nodes of the ASCII-Art graph, so that you will no longer have (at least, you shouldn't) overlapping nodes or nodes placed in really wrong positions. Don't expect IDA's graph, but it's already something ;)
Since, at the moment, you can only see one function in the ASCII-Art graph, I'm currently implementing a way to move between called functions and go back and forth between them without having to leave the graph mode. Something like what you get in Visual Mode with the [0-9] shortcuts (#2907).
For the colours in graph, I think you will have to wait a little bit. In the meantime, if you feel brave you can start to use 'VV!', but don't say I didn't warn you.
- What is the plan for the next months ? Are you going to continue to work on r2 ?
Of course I will continue contributing to r2, in my spare time! I'm planning to focus on graphs, as I've done lately. In particular I think you will see:
- issue #2907 fixed: move between called functions with [0-9] keys
- enhancements in the edge layout. The edges are a real mess at the moment, and for a good enough graph it's one of the fundamental things that have to be fixed.
- mouse support for node selection
- last, and also least :P, you will see colored disassembly in the graph, unless someone else wants to implement it before me (you are welcome!!)
- a lot of other cool stuff, but that's another story and everything else really depends on having a good and strong starting point.
What? GUI? Pff, we just have the terminal :)
Really, I'd really like to see a cool GUI for r2, but don't count on me.
- Most hated/loved feature of r2 ?
- Most hated feature: anal.hasnext set to true by default!
- Most loved feature: ASCII-Art graph, of course ;)
- Advice for new contributors and users ?
Contributors, just focus on something you'd like to be present in r2 and implement it, maintain it and add tests!!
Users, keep using IDA, unless you want to take the red pill and see how deep the rabbit-hole goes ;)
Maybe we will see r2 1.0 in the future! VVRRRRRRRRRRRRRRRRRRR
Example of graph
Disassembly in graph, without colours (yet)
As part of GSoC I (dkreuter) and sushant94 have been working for the last three weeks on what should become the basis for a decompiler integrated with the radare2 reversing framework.
For now it's a standalone program written in Rust that can read the radare2 code format ESIL. The rough process involves generating control and data flow graphs in SSA form for the input, applying simplifications on that, similar to compilers, and picking appropriate constructs in a target language to represent the input. The result will be a more intuitive representation of the analyzed program.
The task is hard even in theory, as a program that prints 4 could've been compiled from print(2+2). There's no way to know. Right now, however, we're just trying to get the first and simplest case (4→4) to work. But the insight that the decompilation process is necessarily an interpretation process is what I try to consider in my designs.
The alternatives to Rust we considered were OCaml and C++. Neither of us had written Rust or OCaml before, but it seemed like many C++ skills would be transferable to Rust (at some cost of idiomaticness). The first two weeks were full of gotchas, but now I'm very comfortable with the language. The Rust IRC channel was very helpful in that regard.
Rust has plenty of cool features, including a very expressive type system (X<U> extends Y<Z> + Q where Z extends X<U>, etc.), a checker that ensures there are no double frees, aliasing non-const pointers or dangling pointers (without a garbage collector), tagged unions, a nice build system unlike C++ (apparently the C++ modules proposal didn't make it into C++17), type checks for metaprogramming (C++ uses duck typing instead), nicely integrated documentation generation and testing, and pattern matching.
In C++ terms, all references in Rust are const restrict * const and have move semantics by default. This does make some tasks more tedious than usual (e.g. swap(&x, &x) needs workarounds to compile), but I still think that these defaults will prevent more problems than they cause.
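One common case of this kind is swapping two elements of the same vector. The sketch below is only an illustration (the function names `swap_by_index` and `swap_ends` are made up here, not taken from the project): the direct form is rejected because it needs two mutable borrows of the same vector at once, while `slice::swap` and `split_at_mut` from the standard library are the usual workarounds.

```rust
use std::mem;

/// Swap two elements by index; the slice method needs only one mutable borrow.
fn swap_by_index(v: &mut [i32], i: usize, j: usize) {
    v.swap(i, j);
}

/// Swap the first and last element by splitting the slice into two
/// non-overlapping mutable parts, which the borrow checker can verify.
fn swap_ends(v: &mut [i32]) {
    if v.len() < 2 {
        return;
    }
    let last = v.len() - 1;
    // head covers v[0..last], tail covers v[last..]; they cannot alias.
    let (head, tail) = v.split_at_mut(last);
    mem::swap(&mut head[0], &mut tail[0]);
}

fn main() {
    let mut v = vec![1, 2, 3];
    // mem::swap(&mut v[0], &mut v[2]); // rejected: `v` borrowed mutably twice
    swap_by_index(&mut v, 0, 1);
    println!("{:?}", v);
    swap_ends(&mut v);
    println!("{:?}", v);
}
```

Both workarounds keep the "no aliasing mutable references" guarantee intact; they just move the proof that the two places are distinct into the standard library.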
So while sushant94 has been working on parsing and representing the data coming from radare2 (with good results it seems), I've been working on the graph data structures, which turned out to be more complicated than anticipated.
Firstly, an SSA graph doesn't only have nodes and edges: it also has one level of nesting of nodes (computations in basic blocks) and an "edge order" (a node representing subtraction needs to know which edge represents its first or second operand), meaning that we couldn't just use a preexisting library (without adaptations). Also, the fact that Rust wants a statically determinable tree of ownership to exist clashes a bit with the requirements of a graph.
In the end we used an existing graph library for the upper level (the basic blocks) and manually manage the lower one with instruction lists in each block. Integers are used as "pointers" between nodes on both levels. Another challenge was the Phi nodes, which (unlike other instructions) have a variable number of operands: as many as their containing basic block has incoming control flow edges. This leads to a lot of special casing, making the code messy. I hope to find time to revisit this later (probably after GSoC).
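The two-level layout described above can be sketched roughly as follows. This is not the project's actual code: all type and method names (`Function`, `Block`, `Instr`, `ValueRef`, `phis_well_formed`) are hypothetical, and for brevity plain vectors are used for both levels instead of an existing graph library for the upper one. It shows integers standing in for pointers, ordered operands, and the Phi invariant.

```rust
/// Index of a basic block in `Function::blocks`.
type BlockId = usize;

/// A "pointer" to an instruction: block index plus position in that block.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ValueRef {
    block: BlockId,
    index: usize,
}

#[derive(Debug)]
enum Instr {
    Const(i64),
    /// Operand order matters: `Sub(a, b)` means a - b.
    Sub(ValueRef, ValueRef),
    /// One operand per incoming control flow edge, in predecessor order.
    Phi(Vec<ValueRef>),
}

#[derive(Debug, Default)]
struct Block {
    preds: Vec<BlockId>,
    instrs: Vec<Instr>,
}

#[derive(Debug, Default)]
struct Function {
    blocks: Vec<Block>,
}

impl Function {
    fn add_block(&mut self) -> BlockId {
        self.blocks.push(Block::default());
        self.blocks.len() - 1
    }

    fn add_edge(&mut self, from: BlockId, to: BlockId) {
        self.blocks[to].preds.push(from);
    }

    /// Append an instruction and return an integer reference to it.
    fn push(&mut self, block: BlockId, instr: Instr) -> ValueRef {
        let instrs = &mut self.blocks[block].instrs;
        instrs.push(instr);
        ValueRef { block, index: instrs.len() - 1 }
    }

    /// Every Phi must have as many operands as its block has incoming edges.
    fn phis_well_formed(&self) -> bool {
        self.blocks.iter().all(|b| {
            b.instrs.iter().all(|i| match i {
                Instr::Phi(ops) => ops.len() == b.preds.len(),
                _ => true,
            })
        })
    }
}

fn main() {
    // A diamond: entry -> left/right -> join.
    let mut f = Function::default();
    let entry = f.add_block();
    let left = f.add_block();
    let right = f.add_block();
    let join = f.add_block();
    f.add_edge(entry, left);
    f.add_edge(entry, right);
    f.add_edge(left, join);
    f.add_edge(right, join);

    let a = f.push(left, Instr::Const(1));
    let b = f.push(right, Instr::Const(2));
    let phi = f.push(join, Instr::Phi(vec![a, b]));
    let diff = f.push(join, Instr::Sub(phi, a));
    println!("{:?} = {:?}", diff, f.blocks[join].instrs[diff.index]);
    println!("phis well formed: {}", f.phis_well_formed());
}
```

Because every reference is a plain integer, the vectors own all nodes and no reference cycles appear in the ownership tree. The price is that a stale index is only caught at run time.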
Once that's done we'll work on some more integration with radare2 and begin the first code that interacts with the graphs, like simplification (2+x+2 → 4+x) and dead code elimination.
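To give an idea of what the 2+x+2 → 4+x simplification involves, here is a minimal sketch on a plain expression tree rather than on the SSA graph. The `Expr` type and the `flatten`, `simplify` and `show` helpers are invented for this example, and 64-bit wrapping arithmetic is an assumption. It covers only the simplification, not dead code elimination.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Const(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Walk a chain of additions, summing constants and collecting other leaves.
fn flatten(e: &Expr, sum: &mut i64, terms: &mut Vec<Expr>) {
    match e {
        Expr::Const(c) => *sum = sum.wrapping_add(*c),
        Expr::Var(_) => terms.push(e.clone()),
        Expr::Add(l, r) => {
            flatten(l, sum, terms);
            flatten(r, sum, terms);
        }
    }
}

/// Fold all constants of an addition chain into a single leading constant.
fn simplify(e: &Expr) -> Expr {
    let mut sum = 0i64;
    let mut terms = Vec::new();
    flatten(e, &mut sum, &mut terms);
    let mut iter = terms.into_iter();
    // Drop a zero constant when there is at least one other term.
    let first = if sum != 0 {
        Expr::Const(sum)
    } else {
        iter.next().unwrap_or(Expr::Const(0))
    };
    iter.fold(first, |acc, t| Expr::Add(Box::new(acc), Box::new(t)))
}

/// Render an expression without parentheses (enough for addition chains).
fn show(e: &Expr) -> String {
    match e {
        Expr::Const(c) => c.to_string(),
        Expr::Var(v) => v.clone(),
        Expr::Add(l, r) => format!("{}+{}", show(l), show(r)),
    }
}

fn main() {
    let x = Expr::Var("x".to_string());
    // (2 + x) + 2
    let e = Expr::Add(
        Box::new(Expr::Add(Box::new(Expr::Const(2)), Box::new(x))),
        Box::new(Expr::Const(2)),
    );
    println!("{} -> {}", show(&e), show(&simplify(&e)));
}
```

The rewrite relies on addition being associative and commutative, which holds for wrapping machine integers but would need more care for other operators.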
As you know, we have 2 students working on r2 for the Google Summer of Code!
As we're 3 weeks into the summer, here's what one of our students, sushant94, has to say about what he's been working on!
We're three weeks into GSoC and I'm having an amazing time. I am working alongside dkreuter and have been learning tons from him too!
Here is the repository where you can track our progress and also give us suggestions :)
We chose Rust as our implementation language. Though at first I was a bit scared of this choice, I quickly realized how great the language is! Rust has allowed me to be far more productive, after, of course, my initial battles with the borrow checker.
The zero-cost abstractions allowed by Rust have been great so far!
This is a quick roundup of what I've been up to:
- Firstly, I got an ESIL parser up and running. We use this parser to convert to an IR which is easier to perform static analysis on. While ESIL is amazing for emulation purposes, it's less suited to static analysis, as it has a very large number of supported opcodes. The RadecoIR (name subject to change) is much simpler in terms of the number of opcodes. Also, ESIL is primarily just strings (as suggested by its name, "Evaluable Strings Intermediate Language"), which can be pretty hard to work with.
- The next step was to build a Control Flow Graph (CFG) out of the IR. Building a control flow graph allows us to reason about the way control flows in the program, and hence helps us better understand the different programming constructs that go into making it. To make debugging and visualization easier, I first went ahead and implemented a dot format emitter (ok, I admit it, I probably did this first as I felt it was more fun :P). Just a brief word about dot: it is a graph description language which can be used as input to Graphviz to obtain visual representations of a graph. The current implementation is just a minimal dot format emitter, and not all features of the dot format are supported. At this point, it still cannot be used on arbitrary graphs. I plan to expand this in the future to support more features and work on generic graphs.
- The last part was making the CFG out of the emitted RadecoIR. This turned out to be pretty simple, however, there are still a ton of improvements to be made here :)
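The first two steps above (turning strings into an IR, then dumping it as dot) can be illustrated with a toy example. This is not Radeco's code: the `Node` and `Ir` types and the `parse` and `to_dot` functions are hypothetical, the input is a made-up ESIL-like subset (comma-separated postfix with only `+` and `=`), and the operand order is an assumption of the sketch. It builds an expression graph, not a CFG.

```rust
use std::fmt::Write;

/// A node of the toy IR; operands are indices into `Ir::nodes`.
#[derive(Debug, PartialEq)]
enum Node {
    Const(u64),
    Reg(String),
    Add(usize, usize),
    /// Assign(destination, value)
    Assign(usize, usize),
}

#[derive(Debug, Default)]
struct Ir {
    nodes: Vec<Node>,
}

impl Ir {
    fn push(&mut self, n: Node) -> usize {
        self.nodes.push(n);
        self.nodes.len() - 1
    }
}

/// Parse a comma-separated postfix string into IR nodes.
/// Returns None on an empty token or on stack underflow.
fn parse(esil: &str) -> Option<Ir> {
    let mut ir = Ir::default();
    let mut stack: Vec<usize> = Vec::new();
    for tok in esil.split(',') {
        let id = match tok {
            "" => return None,
            "+" => {
                // Assumption: the top of the stack is the first operand.
                let a = stack.pop()?;
                let b = stack.pop()?;
                ir.push(Node::Add(a, b))
            }
            "=" => {
                let dst = stack.pop()?;
                let val = stack.pop()?;
                ir.push(Node::Assign(dst, val));
                continue; // an assignment leaves nothing on the stack
            }
            _ => match tok.parse::<u64>() {
                Ok(c) => ir.push(Node::Const(c)),
                Err(_) => ir.push(Node::Reg(tok.to_string())),
            },
        };
        stack.push(id);
    }
    Some(ir)
}

/// Emit a minimal dot description: one labelled node per IR node and
/// one edge from each operation to each of its operands.
/// Labels are not escaped, which is fine for this toy token set.
fn to_dot(ir: &Ir) -> String {
    let mut out = String::from("digraph ir {\n");
    for (i, n) in ir.nodes.iter().enumerate() {
        let (label, operands): (String, Vec<usize>) = match n {
            Node::Const(c) => (c.to_string(), vec![]),
            Node::Reg(r) => (r.clone(), vec![]),
            Node::Add(a, b) => ("+".to_string(), vec![*a, *b]),
            Node::Assign(d, v) => ("=".to_string(), vec![*d, *v]),
        };
        writeln!(out, "    n{} [label=\"{}\"];", i, label).unwrap();
        for op in operands {
            writeln!(out, "    n{} -> n{};", i, op).unwrap();
        }
    }
    out.push_str("}\n");
    out
}

fn main() {
    // Read here as: ebx = eax + 2
    match parse("2,eax,+,ebx,=") {
        Some(ir) => print!("{}", to_dot(&ir)),
        None => eprintln!("malformed input"),
    }
}
```

The printed text can be fed to Graphviz (for example with `dot -Tpng`) to get a picture of the expression.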
Apart from Radeco itself, I also helped make r2pipe.rs, which allows communication with radare2 over pipes. In the future, r2pipe.rs will be used to connect Radeco to radare2. r2pipe.rs is great news if you're a Rust person, as you can now interact with radare2 and extend it to meet your needs!
Check out the documentation if you're interested :)
This is just a glimpse of what's in for the upcoming weeks:
- Integration with radare2.
- Write tests and documentation for all features. The limited documentation we currently have can be found here. This will be updated soon.
- Convert the IR into the SSA form.
- Write dataflow analysis on the SSA form and also perform optimizations like:
Here is a small example of the CFG that we're currently able to generate