5 minute read

Baby reverse was a beginner reversing challenge of this year’s hack.lu CTF. It was a great beginner challenge for people who are new to reversing at all.

A really cool fact was that the challenge author provided some basic “todo” list to solve the challenge.

Hey there, future Reverser!

We created this small challenge to introduce you to reverse engineering. This task might _still_ take quite some time, but trust us, it will be very rewarding!
We sadly can't spoonfeed you, but we created a set of questions that you might want to answer yourself. We expect you to google on your own and find resources.

Sooo, lets get started!

- What kind of binary have you got infront of you? (Hint: "file" command)
- How can you disassemble the file? (objdump, gdb, radare...)
- Which programs are common debuggers?
- How can I use them? (we recommend gdb with the peda plugin)
- - how can I set breakpoints?
- - in which different ways can I step through programs?
- - how can I print/examine the content of memory/addresses
- what is inside registers? what's rax, rip, rsp?
- what is the linux syscall convention?
- - In which register is the second argument?
- - In which register is the syscall number?
- - - where can I find the syscall numbers on my own linux system?
- what happens at a call instruction?
- how can I compare strings in assembly?
- .. ask your teammates for more! annoy them if anything is unclear :P
- .. if you don't got any teammates, use IRC and say that it's about the baby challenge

There is a lot of work ahead of you, and maybe some sleepless nights with a lot of googling - but it will be worth it;-)!
We are certain that with a dedicated mind you can solve this task and from there on you'll be ready for a bright future!
Don't give up, we all have been there. Stick to it and you'll be rewarded =)

Therefore, a beginner could just follow the notes and try to figure out what the program does.

Analysis of the Program

The first command we use is file to find out which architecture the binary supports.

file chall
chall: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

We have a 64-bit binary and for that reason, we have 64-bit registers and we have to use registers for function calls. Next, we run the binary to see what it does. This is a very important step to get a general overview of a challenge.

./chall
Welcome to this Chall!
Enter the Key to win: AAAA

Okay, we can enter some data and then the program exits.

Let’s open the challenge in gdb. In my case, I will use GEF as a gdb plugin instead of peda as recommended by the todo list.

gdb ./chall
gef> start          # runs until the program is loaded

Further, we can step through the program with si (step into) to execute each instruction and have the ability to follow function calls. If we don’t want to follow function calls, we could use ni (not into). While stepping through the program, we can observe, that at some point the instructions are repeating. That means, that we have entered a loop and since the loop contains an XOR instruction, we can guess, that this is some sort of “encoding” algorithm.

Let’s restart the program and find out where our input is stored. If we step through the code, we can observe two syscalls at the beginning. The first syscall prints the message Welcome to this Chall!. We can find the syscall number in the rax register and look the syscall number up (SyscallTable64Bit).

First syscall:
rax = 1 -> write
rsi = 0x4000d7  → "Welcome to this Chall! \nEnter the Key to win:"
It prints the message we saw earlier.

Second syscall:
rax = 0 -> read
rsi = 0x4000d7  → "Welcome to this Chall! \nEnter the Key to win:"
This means that we read (from stdin) to our buffer at rsi. So basically, we write new data in the message buffer.

Afterward, we enter the following loop:

0x400098                  movzx  rdi, BYTE PTR [rsi+0x1]
0x40009d                  xor    QWORD PTR [rsi], rdi
0x4000a0                  inc    rsi
0x4000a3                  dec    rdx
0x4000a6                  jne    0x400098

The loop iterates over our message buffer (with our input) and XORs the current indexed character with the next one in the buffer and writes the result to the currently indexed character. Let the buffer be ABCD.
The algorithm will perform the following operations:

A = 0x41
B = 0x42
C = 0x43
D = 0x44
buffer = {0x41, 0x42, 0x43, 0x44}
i = index (incremented by the algorithm)
buffer[i] = buffer[i] ^ buffer[i+1] ( = 0x41^0x42 = 0x03 )
i = i + 1
buffer[i] = buffer[i] ^ buffer[i+1] ( = 0x42^0x43 = 0x01 )
...

Further, we can observe a loop index (rdx) which is decremented in each iteration. If rdx becomes zero, the loop exits at jne 0x400098

After the loop ends we can see a comparison between two buffers repz cmps BYTE PTR ds:[rsi], BYTE PTR es:[rdi]. If we recall the algorithm, we know that our buffer is referenced by rsi and therefore rdi should be another buffer. Since these two buffers should be equal we can guess that this buffer contains the “encoded” flag.

Developing a Solution

To solve this problem, we first step in gdb until the comparison and copy the content of the other buffer.

 gef➤  x/10gx $rdi
 0x40010c:    0x261838221c060d0a    0x2c42591c2b390f36
 0x40011c:    0x392d171c262c1a36    0x0709382b07014357
 0x40012c:    0x392d17131317011a    0x2e007d5c46060d0a
 0x40013c:    0x6261747274736873    0x0000747865742e00
 0x40014c:    0x0000000000000000    0x0000000000000000

Next, we have to reverse the hex data because it’s stored as little-endian. Here is a really quick and dirty algorithm that does the reversing of the string and also concatenates the parts to one string. (There is certainly a better solution e.g. print the data in gdb with the right format)

import string

str1 = "261838221c060d0a"     
str2 = "2c42591c2b390f36"
str3 = "392d171c262c1a36"
str4 = "0709382b07014357"
str5 = "392d17131317011a"
str6 = "2e007d5c46060d0a"
str7 = "6261747274736873"
str8 = "0000747865742e00"

list = [str1, str2, str3, str4, str5, str6, str7, str8]
goal = ""

for i in range(0, 8): # loops through the list of the giant words dump
    for n in range(len(list[i]),0, -2): # iterate through the string backwards / reverse the string
            goal += list[i][n-2:n]

print(goal)

Next, we want to reverse the XOR operation. Since we know that the flag might start with “flag” we have some known plaintext. In our case 1 byte/character is enough to reverse the XOR operation. Let the first character be p[0] = ‘f’ and the first ciphertext character be 0x0a we can get the next plaintext p[1] with chr(0x0a^ord('f')) = 'l'. To get the whole flag we just have to repeat the XOR operation on p[i] and c[i] to get p[i+1].

plain = "fl" # known plaintext 'flag'
counter = 1
goal = goal.decode("hex")
for char in goal[1:]:
     plain += chr(ord(char) ^ ord(plain[counter]))
     counter += 1
print plain
# flag{Yay_if_th1s_is_yer_f1rst_gnisrever_flag!}...non printables

That’s it.