Binary Exploitation Series (5): How to leak data?
I often read the question “How to leak data?”, so in this post I will give you some basic ideas on how to get information about a target (its binary and its memory layout).
Format String Attacks
If the given binary has a format string vulnerability, you can abuse it to leak a lot of information about the target. For example, you could leak pointers stored on the stack, which may include addresses inside libc, a pointer to the heap or a pointer to the stack itself. You can also dump the whole binary if you don’t have access to it, which may expose sensitive content. For more information, please use the available resources about format string attacks and try to solve old CTF challenges.
Resources:
- LiveOverflow: Simple Format String Attack
- LiveOverflow: Dump whole binary
- Lecture Notes (Syracuse University)
- Google ;-)
Off By One
In general, an off-by-one error means that an operation is one index off. In this case, we’ll overwrite the null byte at the end of a string.
Let’s say we again have a 32-byte buffer. We need to terminate the C string with a null byte because every string function reads until it finds a null byte. Therefore, the valid usage of this buffer is to read at most 31 bytes and to add a null byte at the end (buffer[31] = '\0'). If the code doesn’t make sure that the string is null-terminated within the bounds of the array, a function like puts could leak the following values on the stack/heap/data section until it reaches a null byte.
Let’s do a simple example to demonstrate the importance of the null byte:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//gcc -m64 -o off_by_one off_by_one.c -no-pie -fno-stack-protector
void check_username() {
    char secret[32];
    char name[32];
    puts("Name?");
    scanf("%32s", name);   // intended bug: 32 characters + null byte = 33 bytes
    puts("Secret?");
    scanf("%32s", secret); // overwrites the stray null byte in secret[0]
    // scanf("%s") stops at whitespace, so the input never contains '\n'
    if(strcmp(name, "admin") == 0) {
        puts("Nope. Invalid username.");
    }
    else {
        puts("OK");
    }
    puts(name);            // prints until the next null byte
}
int main(int argc, char **argv) {
    check_username();
    return 0;
}
The function check_username reads two strings with scanf (up to 32 bytes each) and prints only the name buffer. The problem is that scanf writes a null byte after the end of the given input. If we use the whole length of the string (32 bytes), that null byte lands one byte behind the buffer, in the first byte of secret (assuming the compiler places secret directly behind name, as it did here). Reading the secret afterwards overwrites this null byte, so puts interprets name as a longer string and keeps printing until the null byte at the end of secret. As a result, we will print the secret too.
./off_by_one
Name?
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Secret?
secret
OK
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAsecret
The secret is leaked successfully. Just imagine that the secret could be any data that does not contain a null byte.
Leak via Functions
We already did that in Chapter 4, when we leaked a libc function pointer from the GOT and redirected the execution flow back to main to use the leaked libc address in our final exploit stage. We can abuse any function which prints or writes data.
Overwrite Pointers
Another simple way is to overwrite pointers to other strings. For example, suppose there is a pointer to a string in a data section of the binary, and you also have a relative out-of-bounds write through an array in the same data section. Then you can overwrite the pointer to the other string, and the program leaks whatever it now points to when the string is printed later.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//gcc -m64 overwrite.c -o overwrite -no-pie -fno-stack-protector
char *player;
char xxx[32];
int main(int argc, char **argv) {
    puts("Welcome");
    player = malloc(32);
    scanf("%31s", player);
    printf("Hello %s\n", player);
    // here you exploit some logic or function
    // e.g. overwrite characters of another string with an index (y)
    // therefore, you could do something like
    // xxx[y] = 'a';
    // xxx[0] = 'a';
    // xxx[2] = 'a';
    // xxx[-10] = 'a';
    // xxx[500] = 'a';
    // and change the pointer of player.
    // The assignment below only simulates the result of such a write.
    player = (char *)&puts;
    printf("%p\n", (void *)player); // we would just print and convert with pwntools
    return 0;
}
./overwrite
Welcome
Test
Hello Test
0x7fa728c809c0
We successfully leaked the address of puts in libc and we could compute the libc base address for further exploitation.
Uninitialized Variables
Declaring variables without initializing them can also leak memory addresses and data. For example, you have a buffer of size n and you fill the buffer with only n/2 bytes. Then you save the whole buffer (n bytes) into a file byte by byte. Therefore, the last n/2 bytes you write into the file are unknown leftovers from whatever used that memory before.
Here you can see a simple example of such a behavior.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//gcc -m64 -o uninitialized_variable uninitialized_variable.c -no-pie
void check_username() {
    char name[32];
    char secret[32];
    puts("Name?");
    scanf("%32s", name);
    puts("Secret?");
    scanf("%32s", secret);
    // scanf("%s") stops at whitespace, so the input never contains '\n'
    if(strcmp(name, "admin") == 0) {
        puts("Nope. Invalid username.");
    }
    else {
        puts("OK");
    }
    puts(name);
}
void x() {
    char x[32]; // never initialized: still holds what check_username left on the stack
    for(int i=0; i<32; i++) {
        // char is promoted to int, so negative bytes print as ffffffxx
        printf("%x", x[i]);
    }
}
int main(int argc, char **argv) {
    check_username();
    x();
    return 0;
}
./uninitialized_variable
Name?
AAAA
Secret?
BBBB
OK
AAAA
4242424207f001000000010000000ffffffed74000000
We successfully leaked our secret (42 is the hex code of 'B') and some other stale values from the stack.
Brute Force
The last approach for today is brute force. In the best case, your target forks its process, because then every child starts with a copy of the parent’s memory. For example, if you have a master process that forks, the child processes always have the same stack cookie. Hence, we can leak the stack cookie byte by byte (see Chapter 6).
Brute force is also useful with position-independent code. For example, you could recover the return address and the saved base pointer of the binary by brute-forcing them byte by byte.
Let’s say our stack looks like this:
... local variables, buffer overflow
0x00007ffff7dd7660 saved base pointer
0x0000555555554896 return address
.... arguments
.... parent function
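Let’s say that the program prints “Successful” after the function returns to the parent function (0x0000555555554896). As long as a wrong value makes the process crash or misbehave, this message tells us when a guess was right. We start with the value directly behind the buffer, the saved base pointer, and overwrite only its least significant byte with 00, 01, 02, 03 … until we see “Successful” again (at 60). Then we keep that byte, overwrite the second byte as well and do the same again: 0060, 0160, 0260, … 7660 -> “Successful”. We continue like this through the remaining bytes and then through the return address.
Now we know an address on the stack and the address of an instruction of the binary in memory, which is essential if you deal with position-independent code.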
This post did not cover all possible ways, but it gives you an idea of how to get some data in some cases. The most important thing is that you have to be creative with everything you know about the binary and try to understand how it behaves.
Happy Hacking!