Download as pdf or txt
Download as pdf or txt
You are on page 1of 28

Foundations of ARM64 Linux

Debugging, Disassembling, and


Reversing Dmitry Vostokov
Visit to download the full and correct content document:
https://ebookmass.com/product/foundations-of-arm64-linux-debugging-disassembling
-and-reversing-dmitry-vostokov/
More products digital (pdf, epub, mobi) instant
download maybe you interests ...

Foundations of ARM64 Linux Debugging, Disassembling,


and Reversing: Analyze Code, Understand Stack Memory
Usage, and Reconstruct Original C/C++ Code with ARM64
1st Edition Dmitry Vostokov
https://ebookmass.com/product/foundations-of-arm64-linux-
debugging-disassembling-and-reversing-analyze-code-understand-
stack-memory-usage-and-reconstruct-original-c-c-code-with-
arm64-1st-edition-dmitry-vostokov/

Foundations of Linux Debugging, Disassembling, and


Reversing: Analyze Binary Code, Understand Stack Memory
Usage, and Reconstruct C/C++ Code with Intel x64 1st
Edition Dmitry Vostokov
https://ebookmass.com/product/foundations-of-linux-debugging-
disassembling-and-reversing-analyze-binary-code-understand-stack-
memory-usage-and-reconstruct-c-c-code-with-intel-x64-1st-edition-
dmitry-vostokov/

Fundamentals of Trace and Log Analysis: A Pattern-


Oriented Approach to Monitoring, Diagnostics, and
Debugging 1st Edition Dmitry Vostokov

https://ebookmass.com/product/fundamentals-of-trace-and-log-
analysis-a-pattern-oriented-approach-to-monitoring-diagnostics-
and-debugging-1st-edition-dmitry-vostokov-2/

Fundamentals of Trace and Log Analysis A Pattern-


Oriented Approach to Monitoring, Diagnostics, and
Debugging 1st Edition Dmitry Vostokov

https://ebookmass.com/product/fundamentals-of-trace-and-log-
analysis-a-pattern-oriented-approach-to-monitoring-diagnostics-
and-debugging-1st-edition-dmitry-vostokov/
Python Debugging for AI, Machine Learning, and Cloud
Computing: A Pattern-Oriented Approach 1st Edition
Dmitry Vostokov

https://ebookmass.com/product/python-debugging-for-ai-machine-
learning-and-cloud-computing-a-pattern-oriented-approach-1st-
edition-dmitry-vostokov/

Python Debugging for AI, Machine Learning, and Cloud


Computing: A Pattern-Oriented Approach 1st Edition
Vostokov

https://ebookmass.com/product/python-debugging-for-ai-machine-
learning-and-cloud-computing-a-pattern-oriented-approach-1st-
edition-vostokov/

Practical Debugging at Scale: Cloud Native Debugging in


Kubernetes and Production 1st Edition Shai Almog

https://ebookmass.com/product/practical-debugging-at-scale-cloud-
native-debugging-in-kubernetes-and-production-1st-edition-shai-
almog/

Ideological Seduction and Intellectuals in Putin’s


Russia 1st Edition Edition Dmitry Shlapentokh

https://ebookmass.com/product/ideological-seduction-and-
intellectuals-in-putins-russia-1st-edition-edition-dmitry-
shlapentokh/

Philosophical Foundations of Children’s and Family Law


Elizabeth Brake

https://ebookmass.com/product/philosophical-foundations-of-
childrens-and-family-law-elizabeth-brake/
Dmitry Vostokov

Foundations of ARM64 Linux


Debugging, Disassembling, and
Reversing
Analyze Code, Understand Stack Memory Usage,
and Reconstruct Original C/C++ Code with ARM64
Dmitry Vostokov
Dublin, Ireland

ISBN 978-1-4842-9081-1 e-ISBN 978-1-4842-9082-8


https://doi.org/10.1007/978-1-4842-9082-8

© Dmitry Vostokov 2023

Apress Stanard

The use of general descriptive names, registered names, trademarks,


service marks, etc. in this publication does not imply, even in the
absence of a specific statement, that such names are exempt from the
relevant protective laws and regulations and therefore free for general
use.

The publisher, the authors and the editors are safe to assume that the
advice and information in this book are believed to be true and accurate
at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, expressed or implied, with respect to the
material contained herein or for any errors or omissions that may have
been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Apress imprint is published by the registered company APress


Media, LLC, part of Springer Nature.
The registered company address is: 1 New York Plaza, New York, NY
10004, U.S.A.
Preface
The book covers topics ranging from ARM64 assembly language
instructions and writing programs in assembly language to pointers,
live debugging, and static binary analysis of compiled C and C++ code.
Diagnostics of core memory dumps, live and postmortem debugging
of Linux applications, services, and systems, memory forensics,
malware, and vulnerability analysis require an understanding of
ARM64 assembly language and how C and C++ compilers generate
code, including memory layout and pointers. This book is about
background knowledge and practical foundations that are needed to
understand internal Linux program structure and behavior, start
working with the GDB debugger, and use it for disassembly and
reversing. It consists of practical step-by-step exercises of increasing
complexity with explanations and many diagrams, including some
necessary background topics.
By the end of the book, you will have a solid understanding of how
Linux C and C++ compilers generate binary code. In addition, you will
be able to analyze such code confidently, understand stack memory
usage, and reconstruct original C/C++ code.
The book will be useful for
Software support and escalation engineers, cloud security engineers,
SRE, and DevSecOps
Software engineers coming from JVM background
Software testers
Engineers coming from non-Linux environments, for example,
Windows or Mac OS X
Engineers coming from non-ARM environments, for example,
x86/x64
Linux C/C++ software engineers without assembly language
background
Security researchers without assembly language background
Beginners learning Linux software reverse engineering techniques
This book can also be used as an ARM64 assembly language and
Linux debugging supplement for relevant undergraduate-level courses.
Source Code
All source code used in this book can be downloaded from
github.com/apress/arm64-linux-debugging-disassembling-reversing.
Any source code or other supplementary material referenced by the
author in this book is available to readers on GitHub
(https://github.com/Apress). For more detailed information, please
visit http://www.apress.com/source-code.
Table of Contents
Chapter 1:​Memory, Registers, and Simple Arithmetic
Memory and Registers Inside an Idealized Computer
Memory and Registers Inside ARM 64-Bit Computer
“Arithmetic” Project:​Memory Layout and Registers
“Arithmetic” Project:​A Computer Program
“Arithmetic” Project:​Assigning Numbers to Memory Locations
Assigning Numbers to Registers
“Arithmetic” Project:​Adding Numbers to Memory Cells
Incrementing/​Decrementing Numbers in Memory and
Registers
Multiplying Numbers
Summary
Chapter 2:​Code Optimization
“Arithmetic” Project:​C/​C++ Program
Downloading GDB
GDB Disassembly Output – No Optimization
GDB Disassembly Output – Optimization
Summary
Chapter 3:​Number Representations
Numbers and Their Representations
Decimal Representation (Base Ten)
Ternary Representation (Base Three)
Binary Representation (Base Two)
Hexadecimal Representation (Base Sixteen)
Why Are Hexadecimals Used?​
Summary
Chapter 4:​Pointers
A Definition
“Pointers” Project:​Memory Layout and Registers
“Pointers” Project:​Calculations
Using Pointers to Assign Numbers to Memory Cells
Adding Numbers Using Pointers
Incrementing Numbers Using Pointers
Multiplying Numbers Using Pointers
Summary
Chapter 5:​Bytes, Halfwords, Words, and Doublewords
Using Hexadecimal Numbers
Byte Granularity
Bit Granularity
Memory Layout
Summary
Chapter 6:​Pointers to Memory
Pointers Revisited
Addressing Types
Registers Revisited
NULL Pointers
Invalid Pointers
Variables As Pointers
Pointer Initialization
Initialized and Uninitialized Data
More Pseudo Notation
“MemoryPointers” Project:​Memory Layout
Summary
Chapter 7:​Logical Instructions and PC
Instruction Format
Logical Shift Instructions
Logical Operations
Zeroing Memory or Registers
Program Counter
Code Section
Summary
Chapter 8:​Reconstructing a Program with Pointers
Example of Disassembly Output:​No Optimization
Reconstructing C/​C++ Code:​Part 1
Reconstructing C/​C++ Code:​Part 2
Reconstructing C/​C++ Code:​Part 3
Reconstructing C/​C++ Code:​C/​C++ Program
Example of Disassembly Output:​Optimized Program
Summary
Chapter 9:​Memory and Stacks
Stack:​A Definition
Stack Implementation in Memory
Things to Remember
Stack Push Implementation
Stack Pop Implementation
Register Review
Application Memory Simplified
Stack Overflow
Jumps
Calls
Call Stack
Exploring Stack in GDB
Summary
Chapter 10:​Frame Pointer and Local Variables
Stack Usage
Register Review
Addressing Array Elements
Stack Structure (No Function Parameters)
Function Prolog
Raw Stack (No Local Variables and Function Parameters)
Function Epilog
“Local Variables” Project
Disassembly of Optimized Executable
Summary
Chapter 11:​Function Parameters
“FunctionParamete​rs” Project
Stack Structure
Function Prolog and Epilog
Project Disassembled Code with Comments
Parameter Mismatch Problem
Summary
Chapter 12:​More Instructions
PSTATE Flags
Testing for 0
TST – Logical Compare
CMP – Compare Two Operands
TST or CMP?​
Conditional Jumps
Function Return Value
Summary
Chapter 13:​Function Pointer Parameters
“FunctionPointerP​arameters” Project
Commented Disassembly
Summary
Chapter 14:​Summary of Code Disassembly Patterns
Function Prolog/​Epilog
ADR (Address)
Passing Parameters
Accessing Saved Parameters and Local Variables
Summary
Index
About the Author
Dmitry Vostokov
is an internationally recognized expert,
speaker, educator, scientist, and author.
He is the founder of the pattern-oriented
software diagnostics, forensics, and
prognostics discipline and Software
Diagnostics Institute (DA+TA:
DumpAnalysis.org + TraceAnalysis.org).
Vostokov has also authored more than
50 books on software diagnostics,
anomaly detection and analysis, software
and memory forensics, root cause
analysis and problem solving, memory
dump analysis, debugging, software
trace and log analysis, reverse engineering, and malware analysis. He
has more than 25 years of experience in software architecture, design,
development, and maintenance in various industries, including
leadership, technical, and people management roles. Dmitry also
founded Syndromatix, Anolog.io, BriteTrace, DiaThings, Logtellect,
OpenTask Iterative and Incremental Publishing (OpenTask.com),
Software Diagnostics Technology and Services (former Memory Dump
Analysis Services; PatternDiagnostics.com), and Software Prognostics.
In his spare time, he presents various topics on Debugging TV and
explores Software Narratology, its further development as Narratology
of Things and Diagnostics of Things (DoT), Software Pathology, and
Quantum Software Diagnostics. His current areas of interest are
theoretical software diagnostics and its mathematical and computer
science foundations, application of formal logic, artificial intelligence,
machine learning and data mining to diagnostics and anomaly
detection, software diagnostics engineering and diagnostics-driven
development, and diagnostics workflow and interaction. Recent areas
of interest also include cloud native computing, security, automation,
functional programming, and applications of category theory to
software development and big data.
About the Technical Reviewer
Sundar Pandian
has more than three years of experience
in embedded software development,
including development of device drivers,
middleware software, and application
services for the infotainment system on
the Android platform. He’s also
developed CAN protocol drivers for the
automotive braking system on the
Autosar platform.
He’s developed software with C, C++,
and Java and worked in the automotive,
semiconductor, and telecom industries.
He has a bachelor’s in electronics and
communication engineering. Currently,
he serves as a firmware/middleware
engineer for audio DSPs.
© The Author(s), under exclusive license to APress Media, LLC, part of Springer
Nature 2023
D. Vostokov, Foundations of ARM64 Linux Debugging, Disassembling, and Reversing
https://doi.org/10.1007/978-1-4842-9082-8_1

1. Memory, Registers, and Simple


Arithmetic
Dmitry Vostokov1
(1) Dublin, Ireland

Memory and Registers Inside an Idealized


Computer
Computer memory consists of a sequence of memory cells, and each
cell has a unique address (location). Every cell contains a “number.” We
refer to these “numbers” as contents at addresses (locations). Because
memory access is slower than arithmetic instructions, there are so-
called registers to speed up complex operations that require memory to
store temporary results. We can also think about them as stand-alone
memory cells. The name of a register is its address. Figure 1-1
illustrates this.
Figure 1-1 Computer memory represented as a sequence of memory cells and
locations

Memory and Registers Inside ARM 64-Bit Computer


Here, addresses for memory locations containing integer values usually
differ by four or eight, and we also show two registers called X0 and X1.
The first halves of them are called W0 and W1 as shown in Figure 1-2.
Figure 1-2 Typical ARM 64-bit memory and register layout
Because memory cells contain “numbers,” we start with simple
arithmetic and ask a processor to compute the sum of two numbers to
see how memory and registers change their values.

“Arithmetic” Project: Memory Layout and Registers


For our project, we have two memory addresses (locations) that we call
“a” and “b.” We can think about “a” and “b” as names of their respective
addresses (locations). Now we introduce a special notation where (a)
means contents at the memory address (location) “a.” If we use the C or
C++ language to write our project, we declare and define memory
locations “a” and “b” as

static int a, b;
By default, when we load a program, static memory locations are
filled with zeroes, and we can depict our initial memory layout after
loading the program, as shown in Figure 1-3.

Figure 1-3 Initial memory layout after loading the program

“Arithmetic” Project: A Computer Program


We can think of a computer program as a sequence of instructions for
the manipulation of contents of memory cells and registers. For
example, addition operation: add the contents of memory cell №12 to
the contents of memory cell №14. In our pseudo-code, we can write

[14] <- [14] + [12]

Our first program in pseudo-code is shown on the left of the table:

[a] <- 1 Here, we put assembly instructions corresponding to pseudo-


[b] <- 1 code.

[b] <- [b] +


[a]
[a] <- [a] + 1
[b] <- [b] *
[a]

“<-” means moving (assigning) the new value to the contents of a


memory location (address). “//” is a comment sign, and the rest of the
line is a comment. “=” shows the current value at a memory location
(address).
To remind, a code written in a high-level programming language is
translated to a machine language by a compiler. However, the machine
language can be readable if its digital codes are represented in some
mnemonic system called assembly language. For example, ADD X1, X1,
#1 is increment by one of what is stored in the register memory cell X1.

“Arithmetic” Project: Assigning Numbers to


Memory Locations
We remind that “a” means the location (address) of the memory cell,
and it is also the name of the location (address) 00000000004b2b00
(see Figure 1-3). [a] means the contents (number) stored at the address
“a.”
If we use the C or C++ language, “a” is called “the variable a,” and we
write the assignment as
a = 1;
In ARM64 assembly language, we write several instructions for that:

adr x0, a
mov w1, #1
str w1, [x0]

In the GDB disassembly output, we may see the following code:

adrp x0, 0x4b2000


add x0, x0, #0xb00
mov w1, #0x1
str w1, [x0]

We show the translation of our pseudo-code into assembly language


in the right column:

[a] <- 1 // x0 = a adr x0, a


// w1 = 1 mov w1, #1
// [a] = 1 str w1, [x0]
[b] <- 1 // x0 = b adr x0, b
// w1 = 1 mov w1, #1
// [b] = 1 str w1, [x0]
[b] <- [b] + [a]
[a] <- [a] + 1
[b] <- [b] * [a]

adrp x0, 0x4b2000, and subsequent add x0, x0, #0xb00


is how the compiler generates code to calculate the address “a” instead
of specifying it directly. Such code is required for addressing large
regions of memory, and the compiler uses it even for smaller regions
where just one adr instruction is sufficient.
Literal constants have the # prefix, for example, #0x1. The 0x prefix
means the following number is hexadecimal. We explain such numbers
in Chapter 3. Please also notice that the movement direction is the same
Another random document with
no related content on Scribd:
OR IMPLIED, INCLUDING BUT NOT LIMITED TO
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR
ANY PURPOSE.

1.F.5. Some states do not allow disclaimers of certain implied


warranties or the exclusion or limitation of certain types of
damages. If any disclaimer or limitation set forth in this
agreement violates the law of the state applicable to this
agreement, the agreement shall be interpreted to make the
maximum disclaimer or limitation permitted by the applicable
state law. The invalidity or unenforceability of any provision of
this agreement shall not void the remaining provisions.

1.F.6. INDEMNITY - You agree to indemnify and hold the


Foundation, the trademark owner, any agent or employee of the
Foundation, anyone providing copies of Project Gutenberg™
electronic works in accordance with this agreement, and any
volunteers associated with the production, promotion and
distribution of Project Gutenberg™ electronic works, harmless
from all liability, costs and expenses, including legal fees, that
arise directly or indirectly from any of the following which you do
or cause to occur: (a) distribution of this or any Project
Gutenberg™ work, (b) alteration, modification, or additions or
deletions to any Project Gutenberg™ work, and (c) any Defect
you cause.

Section 2. Information about the Mission of


Project Gutenberg™
Project Gutenberg™ is synonymous with the free distribution of
electronic works in formats readable by the widest variety of
computers including obsolete, old, middle-aged and new
computers. It exists because of the efforts of hundreds of
volunteers and donations from people in all walks of life.

Volunteers and financial support to provide volunteers with the


assistance they need are critical to reaching Project
Gutenberg™’s goals and ensuring that the Project Gutenberg™
collection will remain freely available for generations to come. In
2001, the Project Gutenberg Literary Archive Foundation was
created to provide a secure and permanent future for Project
Gutenberg™ and future generations. To learn more about the
Project Gutenberg Literary Archive Foundation and how your
efforts and donations can help, see Sections 3 and 4 and the
Foundation information page at www.gutenberg.org.

Section 3. Information about the Project


Gutenberg Literary Archive Foundation
The Project Gutenberg Literary Archive Foundation is a non-
profit 501(c)(3) educational corporation organized under the
laws of the state of Mississippi and granted tax exempt status by
the Internal Revenue Service. The Foundation’s EIN or federal
tax identification number is 64-6221541. Contributions to the
Project Gutenberg Literary Archive Foundation are tax
deductible to the full extent permitted by U.S. federal laws and
your state’s laws.

The Foundation’s business office is located at 809 North 1500


West, Salt Lake City, UT 84116, (801) 596-1887. Email contact
links and up to date contact information can be found at the
Foundation’s website and official page at
www.gutenberg.org/contact

Section 4. Information about Donations to


the Project Gutenberg Literary Archive
Foundation
Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission
of increasing the number of public domain and licensed works
that can be freely distributed in machine-readable form
accessible by the widest array of equipment including outdated
equipment. Many small donations ($1 to $5,000) are particularly
important to maintaining tax exempt status with the IRS.

The Foundation is committed to complying with the laws


regulating charities and charitable donations in all 50 states of
the United States. Compliance requirements are not uniform
and it takes a considerable effort, much paperwork and many
fees to meet and keep up with these requirements. We do not
solicit donations in locations where we have not received written
confirmation of compliance. To SEND DONATIONS or
determine the status of compliance for any particular state visit
www.gutenberg.org/donate.

While we cannot and do not solicit contributions from states


where we have not met the solicitation requirements, we know
of no prohibition against accepting unsolicited donations from
donors in such states who approach us with offers to donate.

International donations are gratefully accepted, but we cannot


make any statements concerning tax treatment of donations
received from outside the United States. U.S. laws alone swamp
our small staff.

Please check the Project Gutenberg web pages for current


donation methods and addresses. Donations are accepted in a
number of other ways including checks, online payments and
credit card donations. To donate, please visit:
www.gutenberg.org/donate.

Section 5. General Information About Project


Gutenberg™ electronic works
Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could
be freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose
network of volunteer support.

Project Gutenberg™ eBooks are often created from several


printed editions, all of which are confirmed as not protected by
copyright in the U.S. unless a copyright notice is included. Thus,
we do not necessarily keep eBooks in compliance with any
particular paper edition.

Most people start at our website which has the main PG search
facility: www.gutenberg.org.

This website includes information about Project Gutenberg™,


including how to make donations to the Project Gutenberg
Literary Archive Foundation, how to help produce our new
eBooks, and how to subscribe to our email newsletter to hear
about new eBooks.

You might also like