Return Oriented Programming

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

Return-oriented programming

Return-oriented programming (ROP) is a computer security exploit technique that allows an attacker to execute code in the presence of
security defenses[1][2] such as executable space protection and code signing.[3]

In this technique, an attacker gains control of the call stack to hijack program control flow and then executes carefully chosen machine
instruction sequences that are already present in the machine's memory, called "gadgets".[4][nb 1] Each gadget typically ends in a return
instruction and is located in a subroutine within the existing program and/or shared library code.[nb 1] Chained together, these gadgets allow an
attacker to perform arbitrary operations on a machine employing defenses that thwart simpler attacks.

Background
Return-oriented programming is an advanced version of a stack smashing attack. Generally, these types of attacks arise when an adversary
manipulates the call stack by taking advantage of a bug in the program, often a buffer overrun. In a buffer overrun, a function that does not
perform proper bounds checking before storing user-provided data into memory will accept more input data than it can store properly. If the
data is being written onto the stack, the excess data may overflow the space allocated to the function's variables (e.g., "locals" in the stack diagram
to the right) and overwrite the return address. This address will later be used by the function to redirect control flow back to the caller. If it has
been overwritten, control flow will be diverted to the location specified by the new return address.

In a standard buffer overrun attack, the attacker would simply write attack code (the "payload") onto the stack and then overwrite the return
address with the location of these newly written instructions. Until the late 1990s, major operating systems did not offer any protection against
these attacks; Microsoft Windows provided no buffer-overrun protections until 2004.[5] Eventually, operating systems began to combat the
exploitation of buffer overflow bugs by marking the memory where data is written as non-executable, a technique known as executable space
protection. With this enabled, the machine would refuse to execute any code located in user-writable areas of memory, preventing the attacker
from placing payload on the stack and jumping to it via a return address overwrite. Hardware support later became available to strengthen this
protection.

With data execution prevention, an adversary cannot directly execute instructions written to a buffer because the buffer's memory section is
marked as non-executable. To defeat this protection, a return-oriented programming attack does not inject malicious instructions, but rather
uses instruction sequences already present in executable memory, called "gadgets", by manipulating return addresses. A typical data execution
prevention implementation cannot defend against this attack because the adversary did not directly execute the malicious code, but rather
combined sequences of "good" instructions by changing stored return addresses; therefore the code used would be marked as executable.
Return-into-library technique

The widespread implementation of data execution prevention made traditional buffer overflow vulnerabilities difficult or impossible to exploit in
the manner described above. Instead, an attacker was restricted to code already in memory marked executable, such as the program code itself
and any linked shared libraries. Since shared libraries, such as libc, often contain subroutines for performing system calls and other functionality
potentially useful to an attacker, they are the most likely candidates for finding code to assemble an attack.

In a return-into-library attack, an attacker hijacks program control flow by exploiting a buffer overrun vulnerability, exactly as discussed above.
Instead of attempting to write an attack payload onto the stack, the attacker instead chooses an available library function and overwrites the
return address with its entry location. Further stack locations are then overwritten, obeying applicable calling conventions, to carefully pass the
proper parameters to the function so it performs functionality useful to the attacker. This technique was first presented by Solar Designer in
1997,[6] and was later extended to unlimited chaining of function calls.[7]

Borrowed code chunks

The rise of 64-bit x86 processors brought with it a change to the subroutine calling convention that required the first argument to a function to be
passed in a register instead of on the stack. This meant that an attacker could no longer set up a library function call with desired arguments just
by manipulating the call stack via a buffer overrun exploit. Shared library developers also began to remove or restrict library functions that
performed actions particularly useful to an attacker, such as system call wrappers. As a result, return-into-library attacks became much more
difficult to mount successfully.

The next evolution came in the form of an attack that used chunks of library functions, instead of entire functions themselves, to exploit buffer
overrun vulnerabilities on machines with defenses against simpler attacks.[8] This technique looks for functions that contain instruction
sequences that pop values from the stack into registers. Careful selection of these code sequences allows an attacker to put suitable values into the
proper registers to perform a function call under the new calling convention. The rest of the attack proceeds as a return-into-library attack.

Attacks
Return-oriented programming builds on the borrowed code chunks approach and extends it to provide Turing complete functionality to the
attacker, including loops and conditional branches.[9][10] Put another way, return-oriented programming provides a fully functional "language"
that an attacker can use to make a compromised machine perform any operation desired. Hovav Shacham published the technique in 2007[11]
and demonstrated how all the important programming constructs can be simulated using return-oriented programming against a target
application linked with the C standard library and containing an exploitable buffer overrun vulnerability.
A return-oriented programming attack is superior to the other attack types discussed, both in expressive power and in resistance to defensive
measures. None of the counter-exploitation techniques mentioned above, including removing potentially dangerous functions from shared
libraries altogether, are effective against a return-oriented programming attack.

On the x86-architecture

Although return-oriented programming attacks can be performed on a variety of architectures,[11] Shacham's paper and the majority of follow-up
work focus on the Intel x86 architecture. The x86 architecture is a variable-length CISC instruction set. Return-oriented programming on the x86
takes advantage of the fact that the instruction set is very "dense", that is, any random sequence of bytes is likely to be interpretable as some valid
set of x86 instructions.

It is therefore possible to search for an opcode that alters control flow, most notably the return instruction (0xC3) and then look backwards in the
binary for preceding bytes that form possibly useful instructions. These sets of instruction "gadgets" can then be chained by overwriting the
return address, via a buffer overrun exploit, with the address of the first instruction of the first gadget. The first address of subsequent gadgets is
then written successively onto the stack. At the conclusion of the first gadget, a return instruction will be executed, which will pop the address of
the next gadget off the stack and jump to it. At the conclusion of that gadget, the chain continues with the third, and so on. By chaining the small
instruction sequences, an attacker is able to produce arbitrary program behavior from pre-existing library code. Shacham asserts that given any
sufficiently large quantity of code (including, but not limited to, the C standard library), sufficient gadgets will exist for Turing-complete
functionality.[11]

An automated tool has been developed to help automate the process of locating gadgets and constructing an attack against a binary.[12] This tool,
known as ROPgadget, searches through a binary looking for potentially useful gadgets, and attempts to assemble them into an attack payload
that spawns a shell to accept arbitrary commands from the attacker.

On address space layout randomization

The address space layout randomization also has vulnerabilities. According to the paper of Shacham et al.,[13] the ASLR on 32-bit architectures is
limited by the number of bits available for address randomization. Only 16 of the 32 address bits are available for randomization, and 16 bits of
address randomization can be defeated by brute force attack in minutes. For 64-bit architectures, 40 bits of 64 are available for randomization. In
2016, brute force attack for 40-bit randomization is possible, but is unlikely to go unnoticed. Also, randomization can be defeated by de-
randomization techniques.

Even with perfect randomization, if there is any information leakage of memory contents it would help to calculate the base address of for
example a shared library at runtime.[14]
Without use of the return instruction

According to the paper of Checkoway et al.,[15] it is possible to perform return-oriented-programming on x86 and ARM architectures without
using a return instruction (0xC3 on x86). They instead used carefully crafted instruction sequences that already exist in the machine's memory to
behave like a return instruction. A return instruction has two effects: firstly, it searches for the four-byte value at the top of the stack, and sets the
instruction pointer to that value, and secondly, it increases the stack pointer value by four (equivalent to a pop operation). On the x86
architecture, sequences of jmp and pop instructions can act as a return instruction. On ARM, sequences of load and branch instructions can act
as a return instruction.

Since this new approach does not use a return instruction, it has negative implications for defense. When a defense program checks not only for
several returns but also for several jump instructions, this attack may be detected.

Defenses

G-Free

The G-Free technique was developed by Kaan Onarlioglu, Leyla Bilge, Andrea Lanzi, Davide Balzarotti, and Engin Kirda. It is a practical solution
against any possible form of return-oriented programming. The solution eliminates all unaligned free-branch instructions (instructions like RET
or CALL which attackers can use to change control flow) inside a binary executable, and protects the free-branch instructions from being used by
an attacker. The way G-Free protects the return address is similar to the XOR canary implemented by StackGuard. Further, it checks the
authenticity of function calls by appending a validation block. If the expected result is not found, G-Free causes the application to crash.[16]

Address space layout randomization

A number of techniques have been proposed to subvert attacks based on return-oriented programming.[17] Most rely on randomizing the location
of program and library code, so that an attacker cannot accurately predict the location of instructions that might be useful in gadgets and
therefore cannot mount a successful return-oriented programming attack chain. One fairly common implementation of this technique, address
space layout randomization (ASLR), loads shared libraries into a different memory location at each program load. Although widely deployed by
modern operating systems, ASLR is vulnerable to information leakage attacks and other approaches to determine the address of any known
library function in memory. If an attacker can successfully determine the location of one known instruction, the position of all others can be
inferred and a return-oriented programming attack can be constructed.
This randomization approach can be taken further by relocating all the instructions and/or other program state (registers and stack objects) of
the program separately, instead of just library locations.[18][19][20] This requires extensive runtime support, such as a software dynamic
translator, to piece the randomized instructions back together at runtime. This technique is successful at making gadgets difficult to find and
utilize, but comes with significant overhead.

Another approach, taken by kBouncer, modifies the operating system to verify that return instructions actually divert control flow back to a
location immediately following a call instruction. This prevents gadget chaining, but carries a heavy performance penalty, and is not effective
against jump-oriented programming attacks which alter jumps and other control-flow-modifying instructions instead of returns.[21]

Binary code randomization

Some modern systems such as Cloud Lambda (FaaS) and IoT remote updates use Cloud infrastructure to perform on-the-fly compilation before
software deployment. A technique that introduces variations to each instance of an executing software can dramatically increase software's
immunity to ROP attacks. Brute forcing Cloud Lambda may result in attacking several instances of the randomized software which reduces the
effectiveness of the attack. Asaf Shelly published the technique in 2017[22] and demonstrated the use of Binary Randomization in a software
update system. For every updated device, the Cloud-based service introduced variations to code, performs online compilation, and dispatched the
binary. This technique is very effective because ROP attacks rely on knowledge of the internal structure of the software. The drawback of the
technique is that the software is never fully tested before it is deployed because it is not feasible to test all variations of the randomized software.
This means that many Binary Randomization techniques are applicable for network interfaces and system programming and are less
recommended for complex algorithms.

SEHOP
Structured Exception Handler Overwrite Protection is a feature of Windows which protects against the most common stack overflow attacks,
especially against attacks on a structured exception handler.

Against control flow attacks


As small embedded systems are proliferating due to the expansion of the Internet Of Things, the need for protection of such embedded systems is
also increasing. Using Instruction Based Memory Access Control (IB-MAC) implemented in hardware, it is possible to protect low-cost embedded
systems against malicious control flow and stack overflow attacks. The protection can be provided by separating the data stack and the return
stack. However, due to the lack of a memory management unit in some embedded systems, the hardware solution cannot be applied to all
embedded systems.[23]
Against return-oriented rootkits

In 2010, Jinku Li et al. proposed[24] that a suitably modified compiler could completely eliminate return-oriented "gadgets" by replacing each
call f with the instruction sequence pushl $index; jmp f and each ret with the instruction sequence popl %reg; jmp table(%reg), where
table represents an immutable tabulation of all "legitimate" return addresses in the program and index represents a specific index into that
table.[24]: 5–6  This prevents the creation of a return-oriented gadget that returns straight from the end of a function to an arbitrary address in the
middle of another function; instead, gadgets can return only to "legitimate" return addresses, which drastically increases the difficulty of creating
useful gadgets. Li et al. claimed that "our return indirection technique essentially de-generalizes return-oriented programming back to the old
style of return-into-libc."[24] Their proof-of-concept compiler included a peephole optimization phase to deal with "certain machine instructions
which happen to contain the return opcode in their opcodes or immediate operands,"[24] such as movl $0xC3, %eax.

Pointer Authentication Codes (PAC)

The ARMv8.3-A architecture introduces a new feature at the hardware level that takes advantage of unused bits in the pointer address space to
cryptographically sign pointer addresses using a specially-designed tweakable block cipher[25][26] which signs the desired value (typically, a
return address) combined with a "local context" value (e.g., the stack pointer).

Before performing a sensitive operation (i.e., returning to the saved pointer) the signature can be checked to detect tampering or usage in the
incorrect context (e.g., leveraging a saved return address from an exploit trampoline context).

Notably the Apple A12 chips used in iPhones have upgraded to ARMv8.3 and use PACs. Linux gained support for pointer authentication within
the kernel in version 5.7 released in 2020; support for userspace applications was added in 2018.[27]

In 2022, researchers at MIT published a side-channel attack against PACs dubbed PACMAN.[28]

See also
Blind return oriented programming
Integer overflow
JIT spraying
Sigreturn-oriented programming (SROP)
Threaded code – return-oriented programming is a rediscovery of threaded code
Notes
1. Some authors use the term gadget in a somewhat different way and refer to it as mere chunks of program logic or short
sequences of opcodes crafted to perform some desired action.[29]

References
1. Vázquez, Hugo (2007-10-01). "Check Point Secure Platform 8. Sebastian Krahmer, x86-64 buffer overflow exploits and the
Hack" (https://www.pentest.es/checkpoint_hack.pdf) (PDF). borrowed code chunks exploitation technique (http://users.su
Pentest. Barcelona, Spain: Pentest Consultores. p. 219. se.com/~krahmer/no-nx.pdf), September 28, 2005
2. "Thread: CheckPoint Secure Platform Multiple Buffer 9. Abadi, M. N.; Budiu, M.; Erlingsson, Ú.; Ligatti, J. (November
Overflows" (https://www.cpug.org/forums/showthread.php/59 2005). "Control-Flow Integrity: Principles, Implementations,
80-CheckPoint-Secure-Platform-Multiple-Buffer-Overflows). and Applications". Proceedings of the 12th ACM conference
The Check Point User Group. on Computer and communications security - CCS '05. pp. 340–
3. Shacham, Hovav; Buchanan, Erik; Roemer, Ryan; Savage, 353. doi:10.1145/1102120.1102165 (https://doi.org/10.1145%
Stefan. "Return-Oriented Programming: Exploits Without Code 2F1102120.1102165). ISBN 1-59593-226-7. S2CID 3339874 (ht
Injection" (http://cseweb.ucsd.edu/~hovav/talks/blackhat08.ht tps://api.semanticscholar.org/CorpusID:3339874).
ml). Retrieved 2009-08-12. 10. Abadi, M. N.; Budiu, M.; Erlingsson, Ú.; Ligatti, J. (October
4. Buchanan, E.; Roemer, R.; Shacham, H.; Savage, S. (October 2009). "Control-flow integrity principles, implementations, and
2008). "When Good Instructions Go Bad: Generalizing Return- applications". ACM Transactions on Information and System
Oriented Programming to RISC" (http://cseweb.ucsd.edu/~ho Security. 13: 1–40. doi:10.1145/1609956.1609960 (https://doi.
vav/dist/sparc.pdf) (PDF). Proceedings of the 15th ACM org/10.1145%2F1609956.1609960). S2CID 207175177 (https://
conference on Computer and communications security - CCS api.semanticscholar.org/CorpusID:207175177).
'08. pp. 27–38. doi:10.1145/1455770.1455776 (https://doi.org/ 11. Shacham, H. (October 2007). "The geometry of innocent flesh
10.1145%2F1455770.1455776). ISBN 978-1-59593-810-7. on the bone: return-into-libc without function calls (on the
S2CID 11176570 (https://api.semanticscholar.org/CorpusID:11 x86)". Proceedings of the 14th ACM conference on Computer
176570). and communications security - CCS '07. pp. 552–561.
5. Microsoft Windows XP SP2 Data Execution Prevention (http doi:10.1145/1315245.1315313 (https://doi.org/10.1145%2F13
s://technet.microsoft.com/en-us/library/cc700810.aspx) 15245.1315313). ISBN 978-1-59593-703-2. S2CID 11639591 (h
6. Solar Designer, Return-into-lib(c) exploits (http://seclists.org/b ttps://api.semanticscholar.org/CorpusID:11639591).
ugtraq/1997/Aug/63), Bugtraq 12. Jonathan Salwan and Allan Wirth, ROPgadget - Gadgets finder
7. Nergal, Phrack 58 Article 4, return-into-lib(c) exploits (http://p and auto-roper (http://shell-storm.org/project/ROPgadget/)
hrack.org/issues/58/4.html)
13. [Shacham et al., 2004] Hovav Shacham, Matthew Page, Ben 18. Venkat, Ashish; Shamasunder, Sriskanda; Shacham, Hovav;
Pfaff, Eu-Jin Goh, Nagendra Modadugu, and Dan Boneh. On Tullsen, Dean M. (2016-01-01). "HIPStR: Heterogeneous-ISA
the effectiveness of address-space randomization. In Program State Relocation". Proceedings of the Twenty-First
Proceedings of the 11th ACM conference on Computer and International Conference on Architectural Support for
Communications Security (CCS), 2004. Programming Languages and Operating Systems. ASPLOS '16.
14. [Bennett et al., 2013] James Bennett, Yichong Lin, and New York, NY, USA: ACM: 727–741.
Thoufique Haq. The Number of the Beast, 2013. doi:10.1145/2872362.2872408 (https://doi.org/10.1145%2F28
https://www.fireeye.com/blog/threat-research/2013/02/the- 72362.2872408). ISBN 9781450340915. S2CID 7853786 (http
number-of-the-beast.html s://api.semanticscholar.org/CorpusID:7853786).
15. CHECKOWAY, S., DAVI, L., DMITRIENKO, A., SADEGHI, A.-R., 19. Hiser, J.; Nguyen-Tuong, A.; Co, M.; Hall, M.; Davidson, J. W.
SHACHAM, H., AND WINANDY, M. 2010. Return-oriented (May 2012). "ILR: Where'd My Gadgets Go?". 2012 IEEE
programming without returns. In Proceedings of CCS 2010, A. Symposium on Security and Privacy. pp. 571–585.
Keromytis and V. Shmatikov, Eds. ACM Press, 559–72 doi:10.1109/SP.2012.39 (https://doi.org/10.1109%2FSP.2012.3
16. ONARLIOGLU, K., BILGE, L., LANZI, A., BALZAROTTI, D., AND 9). ISBN 978-1-4673-1244-8. S2CID 15696223 (https://api.sem
KIRDA, E. 2010. G-Free: Defeating return-oriented anticscholar.org/CorpusID:15696223).
programming through gadget-less binaries. In Proceedings of 20. US 9135435 (https://worldwide.espacenet.com/textdoc?DB=E
ACSAC 2010, M. Franz and J. McDermott, Eds. ACM Press, 49– PODOC&IDX=US9135435), Venkat, Ashish; Krishnaswamy,
58. Arvind & Yamada, Koichi et al., "Binary translator driven
program state relocation", published 2015-09-15, assigned to
17. Skowyra, R.; Casteel, K.; Okhravi, H.; Zeldovich, N.; Streilein, W.
Intel Corp.
(October 2013). "Systematic Analysis of Defenses against
Return-Oriented Programming" (https://web.archive.org/web/ 21. Vasilis Pappas. kBouncer: Efficient and Transparent ROP
20140222184230/http://web.mit.edu/ha22286/www/papers/c Mitigation (https://www.cs.columbia.edu/~vpappas/papers/kb
onference/Systematic_Analysis_of_Defenses_Against_Return_O ouncer.pdf). April 2012.
riented_Programming.pdf) (PDF). Research in Attacks, 22. US application 2019347385 (https://worldwide.espacenet.co
Intrusions, and Defenses. Lecture Notes in Computer Science. m/textdoc?DB=EPODOC&IDX=US2019347385), Shelly, Asaf,
Vol. 8145. pp. 82–102. doi:10.1007/978-3-642-41284-4_5 (http "Security methods and systems by code mutation", published
s://doi.org/10.1007%2F978-3-642-41284-4_5). ISBN 978-3- 2019-11-14, since abandoned.
642-41283-7. Archived from the original (http://web.mit.edu/h 23. FRANCILLON, A., PERITO, D., AND CASTELLUCCIA, C. 2009.
a22286/www/papers/conference/Systematic_Analysis_of_Defe Defending embedded systems against control flow attacks. In
nses_Against_Return_Oriented_Programming.pdf) (PDF) on Proceedings of SecuCode 2009, S. Lachmund and C. Schaefer,
2014-02-22. Eds. ACM Press, 19–26.
24. Jinku LI, Zhi WANG, Xuxian JIANG, Mike GRACE, and Sina
BAHRAM. Defeating return-oriented rootkits with "return-less"
kernels. (https://www.csc2.ncsu.edu/faculty/xjiang4/pubs/EUR
OSYS10.pdf) In Proceedings of EuroSys 2010, edited by G.
Muller. ACM Press, 195–208.
25. Avanzi, Roberto (2016). The QARMA Block Cipher Family (http 28. Ravichandran, Joseph; Na, Weon Taek; Lang, Jay; Yan, Mengjia
s://web.archive.org/web/20200513144451/https://eprint.iacr.o (June 2022). "PACMAN: attacking ARM pointer authentication
rg/2016/444.pdf) (PDF). IACR Transactions on Symmetric with speculative execution" (https://dl.acm.org/doi/10.1145/34
Cryptology (ToSC) (https://tosc.iacr.org/index.php/ToSC/articl 70496.3527429). Proceedings of the 49th Annual International
e/view/583). Vol. 17 (published 2017-03-08). pp. 4–44. Symposium on Computer Architecture. Association for
doi:10.13154/tosc.v2017.i1.4-44 (https://doi.org/10.13154%2Ft Computing Machinery. doi:10.1145/3470496.3527429 (https://
osc.v2017.i1.4-44). Archived from the original (https://eprint.ia doi.org/10.1145%2F3470496.3527429).
cr.org/2016/444.pdf) (PDF) on 2020-05-13. 29. Cha, Sang Kil; Pak, Brian; Brumley, David; Lipton, Richard Jay
26. Qualcomm Product Security. "Pointer Authentication on (2010-10-08) [2010-10-04]. Platform-Independent Programs
ARMv8.3" (https://www.qualcomm.com/media/documents/file (https://softsec.kaist.ac.kr/~sangkilc/papers/cha-ccs10.pdf)
s/whitepaper-pointer-authentication-on-armv8-3.pdf) (PDF). (PDF). Proceedings of the 17th ACM conference on Computer
Qualcomm Technologies Inc. Archived (https://web.archive.or and Communications Security (CCS'10). Chicago, Illinois, USA:
g/web/20200606102037/https://www.qualcomm.com/media/ Carnegie Mellon University, Pittsburgh, Pennsylvania, USA /
documents/files/whitepaper-pointer-authentication-on-armv8 Georgia Institute of Technology, Atlanta, Georgia, USA.
-3.pdf) (PDF) from the original on 2020-06-06. Retrieved pp. 547–558. doi:10.1145/1866307.1866369 (https://doi.org/1
2020-06-16. "Thus, we designed QARMA, a new family of 0.1145%2F1866307.1866369). ISBN 978-1-4503-0244-9.
lightweight tweakable block ciphers." Archived (https://web.archive.org/web/20220526153147/http
27. "Linux 5.7 For 64-bit ARM Brings In-Kernel Pointer s://softsec.kaist.ac.kr/~sangkilc/papers/cha-ccs10.pdf) (PDF)
Authentication, Activity Monitors - Phoronix" (https://www.ph from the original on 2022-05-26. Retrieved 2022-05-26. [1] (ht
oronix.com/scan.php?page=news_item&px=Linux-5.7-ARM64 tps://web.archive.org/web/20220526182333/http://users.ece.c
-Updates). www.phoronix.com. Retrieved 2020-03-31. mu.edu/~sangkilc/papers/ccs10-cha.pdf) (12 pages) (See also:
[2] (https://security.ece.cmu.edu/pip/index.html)) (NB. Uses
the term gadget for chunks of program logic, in this case
divided into gadget header and gadget body.)

External links
"Computer Scientists Take Over Electronic Voting Machine With New Programming Technique" (https://www.sciencedaily.com/rele
ases/2009/08/090810161902.htm). Science Daily. 2009-08-11.
"Return-oriented Programming Attack Demo video" (https://www.youtube.com/watch?v=a8_fDdWB2-M). YouTube.
AntiJOP: a program that removes JOP/ROP vulnerabilities from assembly language code (http://zsmith.co/antijop.php)

Retrieved from "https://en.wikipedia.org/w/index.php?title=Return-oriented_programming&oldid=1115668360"

This page was last edited on 12 October 2022, at 16:00 (UTC).


Text is available under the Creative Commons Attribution-ShareAlike License 3.0;
additional terms may apply. By using this site, you agree to the Terms of Use
and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.

You might also like