Low-level programming, binary analysis, and security
This reference covers the low-level concepts that show up in binary analysis work: machine code, assembly, executable formats, linking, memory, ARM64, dynamic loading, and Objective-C internals.
Table of contents
- Fundamentals
- Compilation and linking
- System architecture
- ARM architecture
- ARM64 assembly reference
- Dynamic linking and loading
- Advanced analysis techniques
- Objective-C internals
- Anti-analysis techniques
Fundamentals
Machine code
Machine code is the lowest-level programming language that processors directly execute. It consists of binary instructions (0s and 1s) that the CPU can understand and process directly.
Key characteristics
- Pure binary format (0s and 1s)
- Directly executable by the processor
- Platform-specific
- Not human-readable
Assembly language
Assembly language is a low-level programming language that provides a more human-readable representation of machine code. It uses mnemonics to represent machine instructions.
Example assembly instruction
LDR R3, [R2]
This instruction:
- LDR (Load Register) is the mnemonic for the load instruction
- R3 is the destination register
- [R2] indicates that the value in R2 should be interpreted as a memory address
  - The brackets [] tell the processor to use the value in R2 as a pointer to fetch data from memory
Labels and control flow
Labels serve multiple purposes:
- Mark locations for jump instructions
- Reference memory locations
- Define entry points
- Create readable code structure
Example of label usage:
_start:
mov r1, #5
b _start @ Branch (jump) back to _start label
Compilation and linking
Compilation pipeline
The journey from source code to executable involves several stages:
graph TD
A[main.c] --> B[Preprocessor]
B --> C[main.i]
C --> D[Compiler]
D --> E[main.s]
E --> F[Assembler]
F --> G[main.o]
G --> H[Linker]
H --> I[myapp]
Preprocessing stage
- Expands #include directives
- Processes macros and conditional compilation
- Outputs .i files
- Example:
```c
// Before preprocessing
#include <stdio.h>
#define MAX_SIZE 100
int buffer[MAX_SIZE];

// After preprocessing
// (contents of stdio.h)
int buffer[100]; // The macro is expanded and the #define line is gone
```
Compilation stage
- Converts preprocessed C/C++ into assembly code
- Outputs .s files
- Example transformation:
```c
// C code
int main() {
    int x = 5;
    return x;
}

// Generated assembly (ARM64, optimized build)
    .text
    .global _main
_main:
    mov w0, #5    // int is 32 bits, so the return value goes in w0
    ret
```
Assembly stage
- Converts assembly into object code
- Outputs .o files (Mach-O format)
- Uses tools like clang or as
Linking stage
- Resolves external symbols
- Combines object files
- Creates final executable
- Uses clang or ld
Executable file formats
Mach-O file format structure
Mach-O header
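The Mach-O header contains metadata that tells the loader how to interpret the rest of the file. The 32-bit layout is shown here; the 64-bit variant, struct mach_header_64, has the same fields followed by a uint32_t reserved field: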
struct mach_header {
uint32_t magic; // Mach magic number identifier
cpu_type_t cputype; // CPU type identifier
cpu_subtype_t cpusubtype; // CPU subtype identifier
uint32_t filetype; // Type of file
uint32_t ncmds; // Number of load commands
uint32_t sizeofcmds; // Size of all load commands
uint32_t flags; // Flags
};
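As a rough illustration of how an analysis tool might use the magic field, the sketch below classifies the first four bytes of a file. The constant values are the ones defined in <mach-o/loader.h> and <mach-o/fat.h>; they are redefined locally with a hypothetical SKETCH_ prefix so the code builds without Apple headers, and the helper name describe_magic is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values copied from <mach-o/loader.h> and <mach-o/fat.h>; the SKETCH_
   names are local to this example. */
#define SKETCH_MH_MAGIC     0xfeedfaceu /* 32-bit Mach-O, host byte order */
#define SKETCH_MH_CIGAM     0xcefaedfeu /* 32-bit Mach-O, byte-swapped */
#define SKETCH_MH_MAGIC_64  0xfeedfacfu /* 64-bit Mach-O, host byte order */
#define SKETCH_MH_CIGAM_64  0xcffaedfeu /* 64-bit Mach-O, byte-swapped */
#define SKETCH_FAT_MAGIC    0xcafebabeu /* universal binary */
#define SKETCH_FAT_CIGAM    0xbebafecau /* universal binary, byte-swapped */

/* Hypothetical helper: describe a file from its first bytes. */
const char *describe_magic(const unsigned char *bytes, size_t len) {
    uint32_t magic;
    assert(bytes != NULL);
    if (len < sizeof magic) return "too short";
    memcpy(&magic, bytes, sizeof magic); /* read in host byte order */
    switch (magic) {
    case SKETCH_MH_MAGIC:
    case SKETCH_MH_CIGAM:    return "32-bit Mach-O";
    case SKETCH_MH_MAGIC_64:
    case SKETCH_MH_CIGAM_64: return "64-bit Mach-O";
    case SKETCH_FAT_MAGIC:
    case SKETCH_FAT_CIGAM:   return "universal binary";
    default:                 return "not Mach-O";
    }
}
```

A caller would read the first four bytes of a file into a buffer and pass them in; a byte-swapped magic signals that the remaining header fields need swapping before use.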
Key Mach-O sections
- Text section (__TEXT,__text)
  - Contains executable code
  - Read-only and executable
  - Example:
    .text
    .global _main
    _main:
        mov x0, #0
        ret
- Data section (__DATA,__data)
  - Initialized global variables
  - Read-write permissions
  - Example:
    int global_var = 42; // Stored in __DATA,__data
- BSS section (__DATA,__bss)
  - Uninitialized (zero-filled) variables
  - Occupies no space in the file; zero-filled when the image is loaded
  - Example:
    static int uninit_var; // Stored in __DATA,__bss (non-static tentative definitions may land in __DATA,__common instead)
- Read-only data (__TEXT,__const and __TEXT,__cstring)
  - Constant data in __TEXT,__const
  - String literals in __TEXT,__cstring
  - Example:
    const int limit = 100; // Stored in __TEXT,__const
    const char* msg = "Hello"; // The bytes of "Hello" are stored in __TEXT,__cstring; the pointer msg itself is writable data
- Thread-local storage (__DATA,__thread_data)
  - Thread-specific variables
  - Example:
    __thread int thread_var = 1; // Initial value stored in __DATA,__thread_data
Symbol management
Symbol table structure
Symbol tables are critical data structures in executable files that connect symbolic names (such as function and variable names) to their locations in memory. They serve as lookup mechanisms for the linker and loader.
In Mach-O binaries, symbols are stored using the nlist_64 structure (for 64-bit architectures):
struct nlist_64 {
uint32_t n_strx; // Index into string table
uint8_t n_type; // Type flag (N_EXT, N_SECT, etc.)
uint8_t n_sect; // Section number where symbol is defined
uint16_t n_desc; // Description field (REFERENCE_FLAG_UNDEFINED_NON_LAZY, etc.)
uint64_t n_value; // Value (address) of the symbol
};
The key fields are:
- n_strx: Index into the string table where the symbol’s name is stored
- n_type: Indicates if the symbol is external, defined locally, debug information, etc.
- n_sect: Section number where the symbol is defined
- n_value: The address or value of the symbol (0 for an undefined symbol)
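The relationship between n_strx and the string table can be sketched in a few lines of C. The struct below is a local mirror of the nlist_64 layout shown above (the real definition lives in <mach-o/nlist.h>), SKETCH_N_EXT copies the value of N_EXT from that header, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of struct nlist_64 so the sketch builds on any platform. */
struct sketch_nlist_64 {
    uint32_t n_strx;  /* offset of the name in the string table */
    uint8_t  n_type;
    uint8_t  n_sect;
    uint16_t n_desc;
    uint64_t n_value;
};

#define SKETCH_N_EXT 0x01 /* "external symbol" bit of n_type */

/* Return the symbol's name, or NULL when n_strx points outside the table.
   Untrusted binaries can contain bogus offsets, so the bound is checked. */
const char *symbol_name(const struct sketch_nlist_64 *sym,
                        const char *strtab, size_t strtab_size) {
    assert(sym != NULL && strtab != NULL);
    if (sym->n_strx >= strtab_size) return NULL;
    return strtab + sym->n_strx;
}

/* Non-zero when the symbol is visible outside its object file. */
int symbol_is_external(const struct sketch_nlist_64 *sym) {
    assert(sym != NULL);
    return (sym->n_type & SKETCH_N_EXT) != 0;
}
```

A tool like nm effectively walks the array of entries, calls something like symbol_name for each one, and prints the name next to n_value.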
Symbol resolution during static linking
During compilation and static linking:
// Compiler generates code with placeholder references
int add(int a, int b); // External symbol, to be resolved by linker
int main() {
int result = add(5, 3); // Reference to external function
return result;
}
The linker:
- Reads each object file’s symbol table
- Resolves references against definitions
- Determines final addresses for symbols
- Updates all references to point to correct locations
- Creates the final executable’s symbol table
Symbol table in binary analysis
The symbol table is often the first thing to inspect when analyzing a binary, because it reveals function names and imported APIs:
# View symbols in a binary
nm /bin/ls
Output shows symbol types and addresses:
0000000100003f78 T _main # 'T' means defined, text section
U _malloc # 'U' means undefined (external)
0000000100001f50 t _helper_function # 't' means local text symbol
Strip a binary to remove symbols (anti-analysis):
strip binary # macOS strip; GNU strip spells the same request strip -s (--strip-all)
Dynamic symbol resolution
Global offset table (GOT) and position-independent code
Dynamic symbol resolution on ARM64 macOS/iOS uses a technique involving the Global Offset Table (GOT) and special relocations:
// Load the address of printf from its GOT entry
adrp x0, _printf@GOTPAGE
ldr x0, [x0, _printf@GOTPAGEOFF]
How this works:
- What is the GOT?
  - The Global Offset Table is a data structure containing addresses of external symbols
  - It’s populated by the dynamic linker (dyld) at runtime
  - Enables position-independent code by avoiding hardcoded addresses
- The GOTPAGE/GOTPAGEOFF relocations:
  - @GOTPAGE: Calculates the page address (upper bits) of the symbol’s entry in the GOT
  - @GOTPAGEOFF: Calculates the offset within that page (lower bits)
  - Combined, they form the complete address
- Two-step loading process:
  - adrp x0, _printf@GOTPAGE: Load the page address containing printf’s GOT entry into x0
  - ldr x0, [x0, _printf@GOTPAGEOFF]: From that page, load the actual function address
- Benefits:
  - Code can be loaded at any address without modification
  - External libraries can be loaded anywhere in memory
  - Supports Address Space Layout Randomization (ASLR)
  - Enables lazy binding (resolving symbols only when needed)
This is equivalent to asking: “Where is printf actually located?” at runtime instead of assuming a fixed address.
Position-independent code
Position-independent code (PIC) is code that can be executed regardless of where it’s loaded in memory. To understand this concept:
Simple analogy
Consider two types of maps:
- Absolute map: Contains directions like “Go to 123 Main Street” - only works if buildings never change addresses.
- Relative map: Contains directions like “Go 3 blocks north from where you are” - works regardless of the starting point.
PIC follows the relative model. It uses instructions that work regardless of where the code is loaded in memory.
What makes code position-dependent?
Traditional (non-PIC) code often contains:
- Hardcoded memory addresses
- Direct references to specific locations
- Assumptions about where it will be loaded
For example, this position-dependent code assumes it knows exactly where function foo() is located:
// Position-dependent (assumes foo() is at exactly 0x12345678)
JUMP 0x12345678 // Direct jump to hardcoded address
What makes code position-independent?
PIC uses techniques to avoid hardcoded addresses:
- Relative addressing (offsets from current position)
- Indirection tables (GOT, PLT)
- PC-relative calculations
The same example using PIC:
// Position-independent
LOAD address_of_foo from GOT // Look up actual location
JUMP to that address // Jump to wherever foo() actually is
Technical implementation
Position-independent code relies on these key strategies:
- PC-relative addressing - Instructions reference memory locations based on the current program counter
- Indirection tables - Tables of pointers that are updated by the loader
- Relocation information - Extra data that helps the loader patch the code correctly
The GOTPAGE/GOTPAGEOFF example that follows shows how ARM64 uses PC-relative addressing to efficiently implement PIC on modern systems.
Why position-independent code matters
Position-independent code is fundamental to modern operating systems for several key reasons:
- Shared libraries and dynamic loading
- Allows a single copy of a library to be shared among multiple running processes
- Enables libraries to be loaded at any available memory address
- Supports dynamic loading and unloading of modules at runtime
- Security through ASLR
- Address Space Layout Randomization (ASLR) randomizes memory addresses to mitigate attacks
- Makes exploitation techniques like return-oriented programming (ROP) much harder
- Without PIC, ASLR would be impossible or severely limited
- Prevents attackers from reliably predicting memory addresses of code/data
- Efficient memory usage
- Multiple processes can use the same physical memory for shared library code
- Only data sections need separate copies per process
- Significantly reduces overall system memory footprint
- Modern OS requirements
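  - macOS and iOS require position-independent executables on ARM64, and most Linux distributions build them by default; shared libraries are position-independent on all of these systems
  - Code signing and security features rely on code not being modified during loading
  - On 64-bit architectures, PC-relative addressing makes PIC cheap enough that there is little reason to avoid it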
- Compatibility with hardware features
- Modern CPU architectures like ARM64 are designed with PIC in mind
- Special relocation types (like GOTPAGE/GOTPAGEOFF) optimize PIC performance
Without position-independent code, modern memory protection, shared library loading, and flexible dynamic loading would be much harder to implement.
Detailed GOTPAGE/GOTPAGEOFF example
The following example shows what happens when a binary calls a library function such as strlen from libc:
// Code in the binary
// The function address goes in x8 so that x0 stays free for the argument
adrp x8, _strlen@GOTPAGE // Get page address of strlen's GOT entry
ldr x8, [x8, _strlen@GOTPAGEOFF] // Load actual address from GOT
mov x0, x20 // Move string pointer to x0 (first argument)
blr x8 // Call strlen indirectly
Memory layout example
Consider this memory layout:
- The code is loaded at address 0x100008000
- The GOT is at address 0x100010000
- The strlen entry in the GOT is at 0x100010088
- The actual strlen function is at 0x7fff2037a4b0 (in libc)
Runtime behavior
- ADRP instruction (adrp x8, _strlen@GOTPAGE):
  - The linker calculates: GOT address for strlen is 0x100010088
  - The page address (4KB-aligned) of this GOT entry is 0x100010000
  - The ADRP instruction loads 0x100010000 into x8
- LDR instruction (ldr x8, [x8, _strlen@GOTPAGEOFF]):
  - The offset within the page is 0x88 (0x100010088 - 0x100010000)
  - The instruction reads memory at address 0x100010000 + 0x88
  - This loads the value stored at GOT entry: 0x7fff2037a4b0 (actual strlen address)
  - Now x8 contains the real address of strlen
- Function call (blr x8):
  - Calls the function at the address in x8
  - This jumps to the actual strlen implementation at 0x7fff2037a4b0
Disassembled view
A disassembler such as Hopper might show:
0x100008000: adrp x8, #0x100010000 ; _strlen@GOTPAGE
0x100008004: ldr x8, [x8, #0x88] ; _strlen@GOTPAGEOFF
0x100008008: mov x0, x20
0x10000800C: blr x8
Advantages of this approach
- Relocation-free code: The actual strlen address (0x7fff2037a4b0) never appears in the code
- ASLR support: If libc loads at a different address next time, only the GOT entry changes
- Lazy binding: The GOT entry can initially point to a resolver function, filled in on first use
- Efficient: ARM64’s ADRP/LDR combo is optimized for exactly this use case
This is more efficient than older approaches that required multiple instructions or PC-relative addressing with limited range.
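The page arithmetic in the memory layout example can be checked with a short C sketch. It assumes the 4 KB page granularity that ADRP always uses; the function names are hypothetical and only model the address calculation, not the instruction encoding.

```c
#include <assert.h>
#include <stdint.h>

#define ADRP_PAGE_MASK 0xFFFull /* low 12 bits select a byte within a 4 KB page */

/* Address with the low 12 bits cleared: what @GOTPAGE refers to. */
uint64_t page_of(uint64_t addr) { return addr & ~ADRP_PAGE_MASK; }

/* Low 12 bits of the address: what @GOTPAGEOFF refers to. */
uint64_t page_offset_of(uint64_t addr) { return addr & ADRP_PAGE_MASK; }

/* Signed number of 4 KB pages between the instruction and the target.
   This distance, not an absolute address, is what ADRP encodes. */
int64_t adrp_page_delta(uint64_t pc, uint64_t target) {
    return ((int64_t)page_of(target) - (int64_t)page_of(pc)) / 4096;
}

/* Address that an adrp + ldr pair ends up reading from. */
uint64_t adrp_ldr_address(uint64_t pc, int64_t page_delta, uint64_t pageoff) {
    assert(pageoff <= ADRP_PAGE_MASK);
    return page_of(pc) + (uint64_t)(page_delta * 4096) + pageoff;
}
```

With the values from the example, the instruction at 0x100008000 and the GOT entry at 0x100010088 are 8 pages apart, the page offset is 0x88, and recombining them gives back 0x100010088. Because only the distance is encoded, the pair keeps working when the whole image slides to a different base address.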
Advanced dyld symbol resolution examples
Lazy binding
In lazy binding, a symbol is resolved only when it is first called, which improves startup performance:
// First call to an external function (e.g., NSLog)
// 1. The call site branches to a stub
bl _NSLog
// Stub (generated by the linker in __TEXT,__stubs)
// NSLog_lazy_ptr is an illustrative label for the function's slot in __DATA,__la_symbol_ptr
adrp x16, NSLog_lazy_ptr@PAGE
ldr x16, [x16, NSLog_lazy_ptr@PAGEOFF]
br x16 // On the first call the slot leads to dyld_stub_binder, which resolves the symbol
The stub itself is never modified, because code pages are read-only and signed. Instead, dyld writes the resolved address into the lazy pointer after the first call:
// Second call to the same function
bl _NSLog
// Same stub as before, but the lazy pointer now holds the address of the
// actual NSLog implementation (for example 0x100007fb0), so br x16 jumps straight to it
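The resolve-once behaviour can be modelled in portable C. In this sketch a function pointer plays the role of the patched slot, binder stands in for dyld_stub_binder, and strlen is the function being bound; all names other than strlen are hypothetical, and the real mechanism is implemented by the linker and dyld rather than by application code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*strlen_fn)(const char *);

static size_t binder(const char *s);  /* stand-in for dyld_stub_binder */
static strlen_fn lazy_ptr = binder;   /* slot starts out pointing at the binder */
static int bind_count = 0;            /* how many times resolution ran */

/* First call lands here: resolve the target, patch the slot, then
   finish the call the program originally asked for. */
static size_t binder(const char *s) {
    bind_count++;
    lazy_ptr = strlen;
    return lazy_ptr(s);
}

/* The "stub": always an indirect call through the slot. */
size_t call_strlen_via_stub(const char *s) {
    assert(s != NULL);
    return lazy_ptr(s);
}

/* Exposed so callers can observe that binding happens only once. */
int times_bound(void) { return bind_count; }
```

Calling call_strlen_via_stub twice runs the binder only on the first call; the second call goes straight to strlen through the updated pointer.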
Symbol interposition
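macOS allows interposing symbols (overriding library functions). An interposing library is typically injected with the DYLD_INSERT_LIBRARIES environment variable:
// Add to interpose.c
#include <stdio.h>
#include <stdlib.h>
// Replacement for malloc
void* my_malloc(size_t size) {
    // Caution: printf may itself allocate; real interposers often log with write() or use a reentrancy guard
    printf("Intercepted malloc(%zu)\n", size);
    // Call the original malloc (calls made from the interposing library itself are not redirected)
    return malloc(size);
}
// Interpose structure
static const struct { void *replacement; void *original; } _interposers[]
    __attribute__((used, section("__DATA,__interpose"))) = {
    { (void *)my_malloc, (void *)malloc }
};
After the file is built as a dynamic library and loaded into a process (for example through DYLD_INSERT_LIBRARIES), dyld redirects that process’s calls to malloc to my_malloc.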
Framework symbol resolution
Resolving symbols from frameworks often involves image lookup:
// Load framework symbol
adrp x0, _NSSearchPathForDirectoriesInDomains@GOTPAGE
ldr x0, [x0, _NSSearchPathForDirectoriesInDomains@GOTPAGEOFF]
This pattern involves dyld’s two-level namespace, where each symbol reference includes both the symbol name and its originating image (library or framework).
Program initialization
The program startup sequence on macOS:
- dyld loads the program into memory
- Initializes global variables
- Sets up thread-local storage
- Resolves dynamic symbols
- Calls _main
- Program execution begins
Example of startup code:
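// Illustrative entry stub; the label start is hypothetical
// (binaries with LC_MAIN have dyld call _main directly)
.global start
start:
    // Set up stack frame
    stp x29, x30, [sp, #-16]!
    mov x29, sp
    // Call main function
    bl _main
    // Restore stack frame
    ldp x29, x30, [sp], #16
    ret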
System architecture
Operating system concepts
Privilege levels and memory access
Kernel mode vs user mode
- Kernel Mode (EL1)
- Full system access
- Can read/write/execute any process memory
- Critical operations can affect entire system
- Example of kernel mode operation:
// Kernel mode code (simplified)
void kernel_memory_access(void* dest, const void* addr, size_t size) {
    // Direct memory access without checks
    memcpy(dest, addr, size);
}
- User Mode (EL0)
- Sandboxed environment
- Isolated address space
- System access through APIs
- Example of user mode operation:
// User mode code
void user_memory_access(void* addr, size_t size) {
    // Must use system calls for privileged operations
    if (syscall(SYS_mprotect, addr, size, PROT_READ | PROT_WRITE) == -1) {
        // Handle error
    }
}
Process management
Process creation and identification
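// Process creation example
#include <stdio.h>
#include <unistd.h>
pid_t pid = fork();
if (pid < 0) {
    // fork failed; no child process was created
    perror("fork");
} else if (pid == 0) {
    // Child process
    printf("Child PID: %d\n", getpid());
} else {
    // Parent process
    printf("Parent PID: %d\n", getpid());
}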
System calls
// System call example
#include <fcntl.h>  // open, O_RDONLY
#include <unistd.h> // close
int main() {
    // File operations through system calls
    int fd = open("file.txt", O_RDONLY);
    if (fd == -1) {
        // Handle error; there is no descriptor to close
        return 1;
    }
    close(fd);
    return 0;
}
Memory management
Virtual memory and page tables
// Memory mapping example
#include <stddef.h>
#include <sys/mman.h>
void* map_memory(size_t size) {
    void* addr = mmap(NULL, size,
                      PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS,
                      -1, 0);
    if (addr == MAP_FAILED) {
        // Handle error: report failure as NULL rather than returning MAP_FAILED
        return NULL;
    }
    return addr;
}
Memory protection
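// Memory protection example
#include <sys/mman.h>
int protect_memory(void* addr, size_t size) {
    // Request RWX permissions; returns 0 on success, -1 on failure.
    // Hardened platforms (for example macOS on Apple silicon) may refuse
    // pages that are writable and executable at the same time.
    return mprotect(addr, size,
                    PROT_READ | PROT_WRITE | PROT_EXEC);
}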
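Mapping and protection are usually combined: memory is mapped writable, filled, and then locked down. The sketch below shows that sequence with a read-only final state so that it also runs on systems that forbid writable and executable pages; the function name write_then_seal_page is hypothetical.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict glibc modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one page read-write, store a byte, drop write permission, and
   read the byte back. Returns the byte, or -1 if a system call fails. */
int write_then_seal_page(unsigned char value) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;

    p[0] = value; /* allowed: the page is currently writable */

    if (mprotect(p, len, PROT_READ) != 0) { /* seal: read-only from here on */
        munmap(p, len);
        return -1;
    }

    int seen = p[0]; /* reads still work; a write would now fault */
    munmap(p, len);
    return seen;
}
```

A loader follows the same pattern when it maps a segment, applies fixups, and then switches the pages to their final permissions.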
Heap vs stack memory management
graph TD
A[Memory] --> B[Stack]
A --> C[Heap]
B --> D[Automatic allocation]
B --> E[LIFO access pattern]
B --> F[Limited size]
C --> G[Dynamic allocation]
C --> H[Random access pattern]
C --> I[Larger size]
Stack memory characteristics
- Fast allocation (just move stack pointer)
- Automatic cleanup when function returns
- Limited in size (typically few MB)
- LIFO (Last In, First Out) access pattern
- Each thread has its own stack
- Used for: local variables, function parameters, return addresses
Heap memory characteristics
- Dynamic lifetime (independent of function scope)
- Slower allocation (requires memory management algorithm)
- Manual cleanup responsibility (or via garbage collection/ARC)
- Much larger capacity than stack
- Shared across threads (requires synchronization)
- Used for: objects, dynamic arrays, data structures of unknown size
Objective-C heap management
Manual allocation patterns (pre-ARC)
// Create an object
NSObject *obj = [[NSObject alloc] init];
// Use the object
NSString *description = [obj description];
// Release when done
[obj release]; // Decrements reference count
// Alternative pattern with autorelease
NSObject *obj2 = [[[NSObject alloc] init] autorelease];
// obj2 will be released when the current autorelease pool drains
Assembly for manual reference counting:
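// Allocate object: x0 = class, x1 = selector
adrp x0, _OBJC_CLASS_$_NSObject@GOTPAGE
ldr x0, [x0, _OBJC_CLASS_$_NSObject@GOTPAGEOFF]
adrp x1, L_sel_alloc@PAGE // L_sel_alloc is an illustrative label, like L_sel_init
ldr x1, [x1, L_sel_alloc@PAGEOFF]
bl _objc_msgSend // [NSObject alloc]
// Init object: x0 already holds the newly allocated object
adrp x1, L_sel_init@PAGE
ldr x1, [x1, L_sel_init@PAGEOFF]
bl _objc_msgSend // [obj init]
mov x20, x0 // Save object pointer in a callee-saved register
// Use object
// ...
// Release object
mov x0, x20
bl _objc_release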
ARC (Automatic reference counting) patterns
// With ARC, the compiler inserts retain/release calls
{
NSObject *obj = [[NSObject alloc] init];
// Use the object
self.property = obj;
// No explicit release needed, compiler inserts it
}
ARC makes these transformations:
- Tracks object ownership throughout scope
- Inserts retain when storing objects in properties/collections
- Inserts release at end of scope
- Adds autorelease when returning objects from methods
Memory management best practices
// Use strong/weak references appropriately
@property (nonatomic, strong) NSObject *strongRef; // Owns object
@property (nonatomic, weak) NSObject *weakRef; // Doesn't own object
// Break retain cycles with weak references
@interface Parent : NSObject
@property (nonatomic, strong) Child *child; // Strong reference
@end
@interface Child : NSObject
@property (nonatomic, weak) Parent *parent; // Weak reference to break cycle
@end
// Use autorelease pools for temporary objects
@autoreleasepool {
for (int i = 0; i < 10000; i++) {
NSNumber *num = @(i); // Autoreleased object
// Process num
}
} // Pool drained, all autoreleased objects freed
Common memory management issues
Memory leaks
// Memory leak: Creating objects without releasing
- (void)leakExample {
NSMutableArray *array = [[NSMutableArray alloc] init];
[array addObject:@"item"];
// array never released in non-ARC code
// Even with ARC, this can leak:
self.observer = [[NSNotificationCenter defaultCenter]
addObserverForName:@"SomeNotification"
object:nil
queue:nil
usingBlock:^(NSNotification *note) {
[self doSomething]; // Captures self, potential cycle
}];
// Observer never removed
}
Use-after-free errors
// Use-after-free: Accessing freed memory
- (void)useAfterFree {
NSObject *obj = [[NSObject alloc] init];
[obj release]; // Object memory can be reclaimed
NSLog(@"%@", [obj description]); // Using freed memory (crash)
}
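// With ARC, a dangling pointer is still possible through __unsafe_unretained
- (void)dangleExample {
    NSObject __unsafe_unretained *unsafeObj;
    @autoreleasepool {
        NSObject *obj = [[NSObject alloc] init];
        unsafeObj = obj;
        // obj is released when its strong reference goes out of scope
    }
    NSLog(@"%@", unsafeObj); // Dangling pointer: undefined behavior, may crash
    // A __weak reference would be safe here, because ARC sets it to nil on deallocation
}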
Over-release errors
// Over-release: Releasing more than retaining
- (void)overRelease {
NSObject *obj = [[NSObject alloc] init];
[obj release]; // Correct
[obj release]; // Over-release, will crash
}
Debugging memory issues
- Instruments with Leaks template for finding memory leaks
- Zombies mode to detect use-after-free errors
- Address Sanitizer (ASan) for detecting memory errors
- Malloc Stack logging to track allocation/deallocation
Thread management
Thread creation and stack
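#include <pthread.h>
#include <stdio.h>
void* thread_function(void* arg) {
    (void)arg;
    // Local variables live on this thread's own stack (this is not thread-local storage)
    int local_var = 42;
    // Stack buffer, also private to this thread
    char stack_buffer[1024];
    (void)local_var;
    (void)stack_buffer;
    return NULL;
}
int main() {
    pthread_t thread;
    if (pthread_create(&thread, NULL, thread_function, NULL) != 0) {
        fprintf(stderr, "pthread_create failed\n");
        return 1;
    }
    pthread_join(thread, NULL);
    return 0;
}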
ARM architecture
Execution levels
graph TD
A[EL3: Secure Monitor] --> B[EL2: Hypervisor]
B --> C[EL1: Kernel]
C --> D[EL0: User]
- EL0 (User Mode)
- Ordinary applications
- Least privileged level
- Restricted system access
- EL1 (Kernel Mode)
- Operating system kernels
- Device drivers
- Full system access
- EL2 (Hypervisor)
- Virtual machine management
- Resource partitioning
- Virtualization support
- EL3 (Secure Monitor)
- TrustZone operations
- Secure world management
- Security state transitions
Register organization
General purpose registers
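// Register usage example
.global _register_demo
_register_demo:
    stp x29, x30, [sp, #-16]! // Save FP and LR, because bl overwrites LR
    // Callee-saved registers (X19-X28)
    stp x19, x20, [sp, #-16]! // Must preserve these before using them
    // Argument registers (X0-X7)
    mov x0, #1 // First argument
    mov x1, #2 // Second argument
    // Caller-saved registers (X9-X15)
    mov x9, #42 // Temporary value, may be clobbered by the call
    bl _some_function
    mov x19, #100 // Safe: the caller's x19 was saved above
    ldp x19, x20, [sp], #16
    ldp x29, x30, [sp], #16
    ret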
Special registers
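// Special register usage
.global _special_registers
_special_registers:
    stp x29, x30, [sp, #-16]! // Save FP and LR before they are overwritten
    // Frame Pointer (X29)
    mov x29, sp // Set up frame pointer
    // Link Register (X30)
    bl _function // X30 automatically set to return address
    // Stack Pointer (SP)
    sub sp, sp, #16 // Must maintain 16-byte alignment
    add sp, sp, #16 // Release the space again before returning
    // Zero Register
    mov x0, xzr // Clear register using zero register
    ldp x29, x30, [sp], #16 // Restore FP and LR
    ret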
ARM64 assembly reference
This section explains common ARM64 assembly instructions used throughout this reference. Understanding these instructions helps with binary analysis and reverse engineering.
Register usage
General purpose registers
ARM64 provides 31 general-purpose registers (x0-x30):
- x0-x7: Parameter/result registers
  - Used to pass arguments to functions
  - x0 holds the return value from functions
  - Can be freely modified by called functions (caller-saved)
- x8: Indirect result location register
  - Used for returning structures larger than 16 bytes
- x9-x15: Temporary registers
  - Caller-saved registers (not preserved across function calls)
  - Function can use these without saving their previous values
- x16-x17: Intra-procedure-call scratch registers
  - Used by linker for PLT stubs
  - Should not be used in user code across function calls
- x18: Platform register (reserved)
  - Often reserved for platform-specific purposes
- x19-x28: Callee-saved registers
  - Must be preserved by functions that use them
  - If used, their values must be saved and restored
- x29/fp: Frame pointer
  - Points to the current function’s stack frame
  - Used for accessing local variables and saved registers
- x30/lr: Link register
  - Holds the return address during function calls
  - bl instruction automatically sets this register
Special registers
- sp: Stack pointer
  - Points to the current top of the stack
  - Stack grows downward (toward lower addresses)
  - Must maintain 16-byte alignment
- pc: Program counter
  - Contains address of current instruction
  - Not directly accessible in most instructions
- xzr: Zero register
  - Always reads as 0
  - Writes are discarded
Memory operations
Load and store instructions
ldr x0, [x1] // Load 64-bit value from memory address in x1 into x0
- ldr (Load register): Reads data from memory into a register
- [x1] means “use the value in x1 as a memory address”
- The brackets [] indicate indirection (accessing memory)
- This instruction reads 8 bytes (64 bits) from address x1
ldr w0, [x1] // Load 32-bit value from memory address in x1 into w0
- w0 refers to the lower 32 bits of x0
- When loading to a W register, the upper 32 bits of the X register are zeroed
str x0, [x1] // Store 64-bit value from x0 to memory address in x1
- str (Store register): Writes data from a register to memory
- This instruction writes 8 bytes (64 bits) to the address in x1
ldr x0, [x1, #16] // Load from address (x1+16)
- Adds an immediate offset (16) to the base address
- Useful for accessing struct fields or array elements
ldr x0, [x1, x2] // Load from address (x1+x2)
- Adds a register offset (x2) to the base address
- Good for array indexing with variable indices
Advanced memory addressing
ldr x0, [x1, #16]! // Pre-index: Update x1 to x1+16, then load from new address
- ! indicates pre-indexing (address is updated before access)
- First updates x1 to x1+16, then loads from that address
- After execution, x1 contains the new address (x1+16)
ldr x0, [x1], #16 // Post-index: Load from x1, then update x1 to x1+16
- Post-indexing (address is updated after access)
- First loads from the address in x1, then updates x1 to x1+16
- After execution, x1 contains the new address (x1+16)
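The difference between the two writeback forms is easiest to see in a small model. The sketch below treats memory as a byte array and the base register as a variable passed by pointer; the function names are hypothetical and the model ignores alignment and faults.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 64-bit value from a toy byte-addressed memory (host byte order). */
static uint64_t read64(const uint8_t *mem, uint64_t addr) {
    uint64_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}

/* Models: ldr xt, [xn, #imm]!  (update the base first, then load) */
uint64_t ldr_pre_index(const uint8_t *mem, uint64_t *xn, int64_t imm) {
    assert(mem != NULL && xn != NULL);
    *xn += (uint64_t)imm;
    return read64(mem, *xn);
}

/* Models: ldr xt, [xn], #imm  (load from the old base, then update) */
uint64_t ldr_post_index(const uint8_t *mem, uint64_t *xn, int64_t imm) {
    assert(mem != NULL && xn != NULL);
    uint64_t v = read64(mem, *xn);
    *xn += (uint64_t)imm;
    return v;
}
```

Starting from the same base, both forms leave the register at base+16, but the pre-indexed load returns the value stored at base+16 while the post-indexed load returns the value stored at base. Post-indexing is the natural fit for walking an array; pre-indexing with a negative offset is the natural fit for pushing onto the stack.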
Pair operations
ldp x0, x1, [sp, #16] // Load pair: Load 16 bytes from [sp+16] into x0,x1
- ldp (Load pair): Loads two registers from two consecutive memory locations
- Efficient for loading 16 bytes at once
- Commonly used in function prologues/epilogues
stp x0, x1, [sp, #-16]! // Store pair with pre-decrement
- stp (Store pair): Stores two registers to two consecutive memory locations
- With ! and negative offset: allocates space on stack then stores
- Critical for function prologues (saving registers)
Arithmetic and logic
Basic arithmetic
add x0, x1, x2 // x0 = x1 + x2
- Adds values in x1 and x2, stores result in x0
- Does not affect either source register
add x0, x0, #1 // x0 = x0 + 1 (increment x0)
- Immediate variant: adds a constant value (#1)
- Used for incrementing counters, pointer arithmetic
sub x0, x1, x2 // x0 = x1 - x2
- Subtracts x2 from x1, stores result in x0
sub sp, sp, #16 // SP = SP - 16 (allocate 16 bytes on stack)
- When used with SP: allocates space on the stack
- The stack grows downward, so subtraction = allocation
add sp, sp, #16 // SP = SP + 16 (deallocate 16 bytes from stack)
- When used with SP: deallocates stack space
- This is the “restore stack” operation
- Adds bytes to the stack pointer, moving it up to its previous position
- If SP was 0xFF0, it returns to 0x1000
- The memory isn’t cleared, but the pointer moves so the space is available for reuse
Multiply and divide
mul x0, x1, x2 // x0 = x1 * x2
- Multiplies x1 by x2, stores result in x0
udiv x0, x1, x2 // x0 = x1 / x2 (unsigned division)
- Divides x1 by x2 (unsigned), stores result in x0
msub x0, x1, x2, x3 // x0 = x3 - (x1 * x2)
- Multiply-subtract: multiplies x1 by x2, then subtracts from x3
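ARM64 has no remainder instruction, so compilers commonly pair udiv with msub to compute one. The C sketch below mirrors that two-instruction sequence; the function name and the register assignments in the comments are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the sequence:
 *   udiv x8, x0, x1      // q = n / d
 *   msub x0, x8, x1, x0  // r = n - (q * d)
 */
uint64_t remainder_via_msub(uint64_t n, uint64_t d) {
    /* The hardware UDIV returns 0 for a zero divisor instead of trapping;
       this sketch simply rejects that case. */
    assert(d != 0);
    uint64_t q = n / d; /* udiv */
    return n - q * d;   /* msub: x3 - (x1 * x2) */
}
```

Seeing udiv immediately followed by msub on the same operands in a disassembly is therefore a strong hint that the source used the % operator.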
Bitwise operations
and x0, x1, x2 // x0 = x1 & x2 (bitwise AND)
- Performs logical AND on each bit
- Used for masking bits (extracting specific bit fields)
orr x0, x1, x2 // x0 = x1 | x2 (bitwise OR)
- Performs logical OR on each bit
- Used for setting specific bits
eor x0, x1, x2 // x0 = x1 ^ x2 (bitwise XOR)
- Performs exclusive OR on each bit
- Useful for toggling bits, checking for changes
mvn x0, x1 // x0 = ~x1 (bitwise NOT)
- Inverts all bits in x1
Shifts and rotates
lsl x0, x1, #2 // x0 = x1 << 2 (logical shift left by 2 bits)
- Shifts all bits left, filling with zeros
- Equivalent to multiplying by 2^n (here 2^2 = 4)
lsr x0, x1, #3 // x0 = x1 >> 3 (logical shift right by 3 bits)
- Shifts all bits right, filling with zeros
- For unsigned values, equivalent to dividing by 2^n
asr x0, x1, #2 // x0 = x1 >> 2 (arithmetic shift right)
- Shifts right but preserves sign bit
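- For non-negative values, equivalent to dividing by 2^n; for negative values the result rounds toward negative infinity, whereas integer division rounds toward zero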
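The three shifts can be modelled in C to make the rounding behaviour concrete. This is a sketch under the assumption of a two's complement target; asr is written with unsigned arithmetic so it does not depend on how the compiler treats >> on negative values, and the function names simply reuse the mnemonics.

```c
#include <assert.h>
#include <stdint.h>

/* lsl: shift left, zero fill. */
uint64_t lsl(uint64_t x, unsigned n) { assert(n < 64); return x << n; }

/* lsr: shift right, zero fill. */
uint64_t lsr(uint64_t x, unsigned n) { assert(n < 64); return x >> n; }

/* asr: shift right, copying the sign bit into the vacated positions. */
int64_t asr(int64_t x, unsigned n) {
    assert(n < 64);
    uint64_t u = (uint64_t)x >> n;
    if (x < 0 && n > 0) {
        u |= ~(~0ull >> n); /* set the top n bits */
    }
    return (int64_t)u; /* assumes two's complement conversion */
}
```

For example, asr(-7, 1) yields -4, while the C expression -7 / 2 yields -3. This is why compilers emit a small correction sequence around asr when they implement signed division by a power of two.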
Control flow
Branches
b label // Branch to label (unconditional jump)
- Changes execution to the instruction at label
- Uses PC-relative addressing (offset from current PC)
bl function // Branch with link to function
- Branches to function address
- Stores return address in link register (x30/lr)
- This is how functions are called in ARM64
blr x0 // Branch with link to address in x0
- Branches to the address stored in register x0
- Stores return address in link register
- Used for function pointers, virtual methods
ret // Return from subroutine
- Returns to address stored in x30/lr
- Usually the last instruction in a function
Conditional branches
cmp x0, x1 // Compare x0 with x1 (sets condition flags)
- Subtracts x1 from x0 without storing the result
- Sets condition flags (zero, negative, carry, overflow)
- Used before conditional branches
beq label // Branch if equal (if Z flag set)
- Branches to label if the result of comparison was equal
- Checks the Zero flag set by previous comparison
bne label // Branch if not equal (if Z flag clear)
- Branches if the compared values were not equal
bgt label // Branch if greater than (signed)
- Branches if first value was greater than second
- For signed comparisons (treats values as signed integers)
blt label // Branch if less than (signed)
- Branches if first value was less than second
b.eq label // Canonical A64 spelling of beq
- Same instruction as beq; b.eq is the form that disassemblers print and the architecture documentation uses
- Preferred in new code
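How cmp feeds the conditional branches can be modelled by computing the four condition flags (N, Z, C, V) for a 64-bit subtraction and then evaluating each condition from them. The struct and function names below are hypothetical; the flag rules follow the standard definition of a flag-setting subtract.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct flags { bool n, z, c, v; };

/* Flags produced by: cmp a, b  (a subtraction whose result is discarded) */
struct flags cmp_flags(uint64_t a, uint64_t b) {
    uint64_t r = a - b;
    struct flags f;
    f.n = (r >> 63) != 0;                   /* result is negative */
    f.z = (r == 0);                         /* result is zero */
    f.c = (a >= b);                         /* no borrow (unsigned a >= b) */
    f.v = (((a ^ b) & (a ^ r)) >> 63) != 0; /* signed overflow */
    assert(f.z == (a == b));
    return f;
}

/* Conditions tested by the branches described above. */
bool cond_eq(struct flags f) { return f.z; }
bool cond_ne(struct flags f) { return !f.z; }
bool cond_gt(struct flags f) { return !f.z && f.n == f.v; } /* signed > */
bool cond_lt(struct flags f) { return f.n != f.v; }         /* signed < */
```

Comparing N with V rather than testing N alone is what keeps the signed conditions correct when the subtraction overflows, for example when comparing the most negative value against 1.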
Advanced control
cbz x0, label // Compare and Branch if Zero
- If x0 equals zero, branch to label
- More efficient than separate cmp + beq
cbnz x0, label // Compare and Branch if Not Zero
- If x0 is not zero, branch to label
- Used for null pointer checks, loop condition tests
Stack operations
Basic stack usage
// Simple stack sequence
sub sp, sp, #16 // Allocate 16 bytes on stack
str x0, [sp, #8] // Store x0 at offset 8 from sp
str x1, [sp] // Store x1 at offset 0 from sp
// ... code using stack values ...
ldr x0, [sp, #8] // Restore x0 from stack
ldr x1, [sp] // Restore x1 from stack
add sp, sp, #16 // Deallocate 16 bytes (restore stack)
The above sequence:
- Allocates 16 bytes of stack space by decreasing SP
- Stores registers x0 and x1 onto the stack
- Later retrieves values back into registers
- Deallocates stack space by increasing SP
Understanding stack operations in detail
What is the stack pointer (SP)?
The stack pointer (sp) is a special register that points to the current “top” of the stack. In ARM64:
- The stack grows downward in memory (from higher to lower addresses)
- sp always points to the last item pushed onto the stack
- The stack is a region of memory used for temporary storage of data like:
- Local variables
- Return addresses
- Saved registers
- Function parameters that don’t fit in registers
Allocating stack space
When you allocate bytes on the stack using:
sub sp, sp, #16 // Allocate 16 bytes
What happens:
- The stack pointer value is decreased by 16 bytes
- This creates 16 bytes of new “space” on the stack
- No memory is actually modified - just the pointer moves
- This space is now available for your function to use
Visual example
Memory     Before        After
Address    Allocation    Allocation
--------   -----------   -----------
0x1000     <- SP
0x0FF8
0x0FF0                   <- SP
Storing at an offset from SP
When you store data using an offset:
str x0, [sp, #8] // Store x0 at SP+8
What happens:
- The address is calculated: SP + 8
- The value in register x0 is written to that memory address
- The stack pointer itself doesn’t move
Visual example (after allocating 16 bytes)
Memory     Contents
Address    After Operations
--------   ----------------
0x1000     (previous data)
0x0FF8     [x0's value]       <- SP+8
0x0FF0     (unused)           <- SP
Complete stack example
The following example allocates stack space, uses it, and then deallocates it:
my_function:
// Prologue: allocate 16 bytes on stack
sub sp, sp, #16 // SP = SP - 16
// Store two registers on stack
str x0, [sp, #8] // Store first parameter at SP+8
str x1, [sp] // Store second parameter at SP
// Do some work with the parameters...
ldr x0, [sp, #8] // Load first parameter back into x0
add x0, x0, #5 // Add 5 to it
// Store result at SP+8
str x0, [sp, #8]
// Load results from stack
ldr x0, [sp, #8] // Load result into return register
// Epilogue: deallocate stack space
add sp, sp, #16 // SP = SP + 16
ret // Return to caller
In this example:
- The function subtracts 16 from SP to allocate space.
- It stores parameters at specific offsets (SP+0 and SP+8).
- It performs calculations using those values.
- It stores the result back to the stack.
- It loads the result into x0, the return value register.
- It deallocates the stack space by adding 16 to SP.
- It returns to the caller.
Offsets from SP organize the stack frame. Each value has a predictable location, so later instructions can load it by offset.
Function prologue and epilogue
// Function prologue (standard pattern)
stp x29, x30, [sp, #-16]! // Save FP and LR, allocate 16 bytes
mov x29, sp // Set up frame pointer
sub sp, sp, #32 // Allocate 32 bytes for local variables
// Function body...
// Function epilogue (standard pattern)
add sp, sp, #32 // Deallocate locals
ldp x29, x30, [sp], #16 // Restore FP and LR, deallocate 16 bytes
ret // Return to caller
This pattern:
- Saves frame pointer and return address
- Establishes new frame pointer
- Allocates space for local variables
- When done, undoes these steps in reverse order
- Returns to caller
Advanced stack alignment
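mov x9, sp // Copy SP to a scratch register (AND cannot use SP as a source operand)
and sp, x9, #-16 // Ensure 16-byte stack alignment
- Bitwise ANDs the copied SP value with -16 (0xFFFFFFFFFFFFFFF0)
- Clears the lower 4 bits (rounds down to the nearest multiple of 16)
- ARM64 requires SP to be 16-byte aligned whenever it is used as the base address of a memory access, and the procedure call standard requires it at every function call boundary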
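The rounding performed by the AND instruction is ordinary mask arithmetic, shown here as a minimal C sketch. The function names align_down_16 and round_up_16 are made up for this illustration. The second helper shows the related calculation a compiler performs when it sizes a frame so that SP stays aligned.

```c
#include <stdint.h>

/* Round an address down to the nearest multiple of 16 by clearing
   the low four bits, which is what ANDing with -16 does. */
static uint64_t align_down_16(uint64_t address) {
    return address & ~(uint64_t)15;
}

/* Round a byte count up to a multiple of 16, as needed when sizing
   a stack frame that must keep SP aligned. */
static uint64_t round_up_16(uint64_t size) {
    return (size + 15) & ~(uint64_t)15;
}
```

For example, a function that needs 20 bytes of locals would reserve round_up_16(20), which is 32 bytes.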
SIMD and floating point
Floating point registers
ARM64 has 32 floating-point/SIMD registers (v0-v31):
- Can be accessed as:
  - b0-b31: 8-bit values
  - h0-h31: 16-bit values
  - s0-s31: 32-bit values
  - d0-d31: 64-bit values
  - q0-q31: 128-bit values
Floating point load/store
ldr s0, [x0] // Load 32-bit float from address in x0 into s0
- Loads a single-precision float
ldr d0, [x0] // Load 64-bit double from address in x0 into d0
- Loads a double-precision float
Floating point arithmetic
fadd d0, d1, d2 // d0 = d1 + d2 (double precision)
- Adds two double-precision floating point values
fmul s0, s1, s2 // s0 = s1 * s2 (single precision)
- Multiplies two single-precision floating point values
SIMD instructions
fmov d0, #1.0 // Move immediate float value 1.0 into d0
- Loads an immediate floating point value into a register
add v0.4s, v1.4s, v2.4s // Add four 32-bit integers in parallel
- Adds four 32-bit lanes from v1 and v2, stores in v0
- Processes multiple elements in a single instruction
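What the vector add does lane by lane can be written as a scalar loop. The sketch below is a model, not how the hardware is programmed; add_4s is a hypothetical name chosen to echo the .4s arrangement.

```c
#include <stdint.h>

/* Scalar model of: add v0.4s, v1.4s, v2.4s
   Each of the four 32-bit lanes is added independently, so a carry
   out of one lane never reaches its neighbor. */
static void add_4s(const uint32_t a[4], const uint32_t b[4], uint32_t out[4]) {
    for (int lane = 0; lane < 4; lane++) {
        out[lane] = a[lane] + b[lane]; /* wraps modulo 2^32 */
    }
}
```

The hardware performs all four additions with one instruction, while the loop above needs four scalar adds.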
Dynamic linking and loading
Dynamic linker operation
The dynamic linker (dyld) on macOS and iOS is responsible for loading and preparing executable code at runtime.
The role of dyld (dynamic link editor)
The loading process follows these key steps:
- Load the Main Executable
  - Binary mapped into memory
  - Headers parsed to identify dependencies
  - Path to shared libraries determined
- Load Dependent Libraries
  - Recursive loading of all dependent libraries
  - Resolution of library search paths
  - Library validation (code signing, permissions)
- Perform Relocations
  - Fixup all addresses based on actual load locations
  - Update references to match the random memory layout
  - Prepare for symbol resolution
- Bind Symbols
  - Resolve external function and data references
  - Fill in address tables (GOT, etc.)
  - Implement lazy/non-lazy binding as needed
- Initialize Libraries
  - Run library initialization code in dependency order
  - Execute static constructors
  - Prepare runtime environment
Simplified dyld implementation
The following pseudocode shows a simplified dyld loading flow:
void dyld_main(const macho_header* mainExecutable) {
// Map main executable
mapMainExecutable(mainExecutable);
// Load dependent libraries recursively
loadDependentLibraries(mainExecutable);
// Perform relocations for Position Independent Code
applyRelocations(mainExecutable);
// Bind external symbols
bindSymbols(mainExecutable);
// Run initialization routines
runInitializers(mainExecutable);
// Jump to executable's entry point
call_main(mainExecutable->entrypoint);
}
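The "dependency order" used for initializers is a post-order walk of the dependency graph: an image's dependencies are initialized before the image itself. The following C sketch illustrates that idea only. The Image type, MAX_DEPS, and run_initializers are hypothetical and do not correspond to dyld's real data structures.

```c
#include <stddef.h>

#define MAX_DEPS 4

/* Hypothetical model of a loaded image; not a dyld data structure. */
typedef struct Image {
    const char   *name;
    struct Image *deps[MAX_DEPS]; /* NULL-terminated unless full */
    int           initialized;
} Image;

/* Initialize dependencies first, then the image itself (post-order).
   Appends each image name to `order` and returns the new count.
   The caller must provide room for every reachable image. */
static size_t run_initializers(Image *image, const char **order, size_t count) {
    if (image->initialized) {
        return count; /* already handled, e.g. a shared dependency */
    }
    image->initialized = 1; /* mark before recursing so cycles terminate */
    for (size_t i = 0; i < MAX_DEPS && image->deps[i] != NULL; i++) {
        count = run_initializers(image->deps[i], order, count);
    }
    order[count++] = image->name;
    return count;
}
```

With an app that depends on Foundation and libSystem, where Foundation also depends on libSystem, the recorded order is libSystem, Foundation, app, and the shared dependency is initialized only once.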
Dynamic linker control variables
The dynamic linker’s behavior can be controlled through environment variables:
# Log calls to dyld API functions (dlopen, dlsym, ...)
export DYLD_PRINT_APIS=1
# Print loaded images
export DYLD_PRINT_LIBRARIES=1
# Modify library search paths
export DYLD_LIBRARY_PATH=/path/to/custom/libs
# Insert libraries into processes
export DYLD_INSERT_LIBRARIES=/path/to/libhook.dylib
Symbol resolution techniques
Runtime symbol lookup
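The following simplified sketch resolves a symbol by scanning an image's symbol table. Current versions of dyld normally look up exports through a compact export trie instead, but the scan shows how the nlist_64 fields fit together: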
// Simplified implementation of how dyld resolves symbols
Symbol* dyld_lookup_symbol(const char* symbolName, ImageLoader* fromImage) {
// Get string table and symbol table
const char* stringTable = fromImage->getStringTable();
const struct nlist_64* symbolTable = fromImage->getSymbolTable();
uint32_t symbolCount = fromImage->getSymbolCount();
// Iterate through symbols
for (uint32_t i = 0; i < symbolCount; i++) {
const struct nlist_64* symbol = &symbolTable[i];
// Get symbol name from string table using n_strx
const char* name = &stringTable[symbol->n_strx];
// Check if matches target symbol
if (strcmp(name, symbolName) == 0) {
// Check if symbol is exported (using n_type field)
if ((symbol->n_type & N_EXT) && !(symbol->n_type & N_STAB)) {
// Calculate actual address using n_value
return calculateActualAddress(symbol->n_value, symbol->n_sect);
}
}
}
return NULL; // Symbol not found
}
Lazy vs. non-lazy symbol binding
Mach-O uses different binding strategies for efficiency:
// Non-lazy binding: Symbols resolved at load time
// Used for data references that must be valid immediately
extern NSString *const kImportantConstant; // Non-lazy binding
// Lazy binding: Symbols resolved on first use
// Used for function calls to minimize startup time
void someRarelyCalledFunction(void); // Lazy binding
The actual binding mechanism uses stub code:
// Lazy binding stub for external function call
_external_function_stub:
adrp x16, _lazy_binding_info@PAGE // Page containing the lazy symbol pointer
ldr x16, [x16, _lazy_binding_info@PAGEOFF] // Load the pointer's current value
br x16 // First call: reaches dyld_stub_binder through a helper; later calls: the target
On the first call, dyld:
- Resolves the symbol using the symbol tables
- Updates the lazy symbol pointer that the stub loads so that it points to the target function
After that, calls through the stub go directly to the target without dyld intervention.
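The resolve-once-then-patch behavior can be imitated in plain C with a function pointer that initially points at a binder. This is a conceptual model under stated assumptions, not dyld code: square_impl, square_binder, square_lazy_ptr, call_square, and resolve_count are all hypothetical names.

```c
typedef int (*square_fn)(int);

static int square_impl(int x) { return x * x; } /* stands in for the real target */
static int resolve_count = 0;                   /* how often the binder ran */

static int square_binder(int x);

/* Models the lazy symbol pointer: initially it points at the binder. */
static square_fn square_lazy_ptr = square_binder;

/* Models the stub helper plus binder: resolve once, patch the pointer,
   then forward the original call to the resolved target. */
static int square_binder(int x) {
    resolve_count++;
    square_lazy_ptr = square_impl; /* later calls skip the binder */
    return square_lazy_ptr(x);
}

/* Models the stub: load the pointer and branch through it. */
static int call_square(int x) { return square_lazy_ptr(x); }
```

Calling call_square twice runs the binder only on the first call, so resolve_count stays at 1 afterwards.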
Bind opcodes
The Mach-O format uses a specialized format called “bind opcodes” to encode symbol binding information. This compact representation tells the dynamic linker how to resolve symbols:
// Binding opcodes found in LC_DYLD_INFO_ONLY load command
// Example interpretation:
BIND_OPCODE_SET_DYLIB_ORDINAL_IMM(1) // symbol from dylib index 1
BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM(0, "_printf") // symbol name
BIND_OPCODE_SET_TYPE_IMM(BIND_TYPE_POINTER) // binding type
BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB(1, 0x1000) // location
BIND_OPCODE_DO_BIND() // perform binding
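Each bind opcode is a single byte whose high four bits select the operation and whose low four bits carry a small immediate. Larger operands, such as the offset in the example above, follow as ULEB128 values. The sketch below decodes both pieces. The helper names are hypothetical, and the mask values are copied from the BIND_OPCODE_MASK and BIND_IMMEDIATE_MASK definitions in mach-o/loader.h so that the code builds without that header.

```c
#include <stddef.h>
#include <stdint.h>

/* Same values as BIND_OPCODE_MASK and BIND_IMMEDIATE_MASK. */
#define OPCODE_MASK    0xF0u
#define IMMEDIATE_MASK 0x0Fu

static uint8_t bind_opcode(uint8_t byte)    { return byte & OPCODE_MASK; }
static uint8_t bind_immediate(uint8_t byte) { return byte & IMMEDIATE_MASK; }

/* Decode one ULEB128 value starting at *pos and advance *pos past it.
   Returns 0 on success, -1 if the data is truncated or too long. */
static int read_uleb128(const uint8_t *data, size_t size, size_t *pos, uint64_t *out) {
    uint64_t result = 0;
    unsigned shift = 0;
    while (*pos < size) {
        uint8_t byte = data[(*pos)++];
        if (shift >= 64) {
            return -1; /* value does not fit in 64 bits */
        }
        result |= (uint64_t)(byte & 0x7F) << shift; /* low 7 bits are payload */
        if ((byte & 0x80) == 0) {                   /* high bit clear: last group */
            *out = result;
            return 0;
        }
        shift += 7;
    }
    return -1; /* ran out of bytes before the final group */
}
```

The byte 0x11, for instance, splits into opcode 0x10 (BIND_OPCODE_SET_DYLIB_ORDINAL_IMM) and immediate 1, and the offset 0x1000 is encoded as the two ULEB128 bytes 0x80 0x20.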
Resolution process
Symbol resolution goes through these stages:
- Symbol Lookup
  - Search in flat namespace or two-level namespace
  - Find defining library for symbol
- Address Resolution
  - Determine actual address of symbol
  - Account for library base address
- GOT/Stub Patching
  - Update function address tables
  - Patch stubs for direct calls after first resolution
Two-level namespace
macOS uses a two-level namespace to avoid symbol conflicts:
struct two_level_hint {
uint32_t library_ordinal : 8,
symbol_index : 24;
};
Each symbol reference includes both the symbol name and the library identifier, preventing collisions across libraries with identical symbol names.
The implementation considers both symbol name and library ordinal:
// Two-level namespace resolution
Symbol* lookupTwoLevelNamespace(const char* name, int libraryOrdinal) {
// Get the specific library for this ordinal
ImageLoader* library = getLibraryForOrdinal(libraryOrdinal);
if (!library)
return NULL;
// Look only in that specific library
return library->findExportedSymbol(name);
}
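In Mach-O symbol tables, the library ordinal of an undefined symbol is packed into the high 8 bits of the 16-bit n_desc field of its nlist entry. The sketch below repeats the arithmetic of the GET_LIBRARY_ORDINAL macro from mach-o/nlist.h in portable C. The function names library_ordinal and with_library_ordinal are hypothetical.

```c
#include <stdint.h>

/* Same arithmetic as GET_LIBRARY_ORDINAL: the ordinal occupies the
   high 8 bits of the 16-bit n_desc field. */
static unsigned library_ordinal(uint16_t n_desc) {
    return (n_desc >> 8) & 0xFFu;
}

/* Counterpart that stores an ordinal while keeping the low 8 bits. */
static uint16_t with_library_ordinal(uint16_t n_desc, unsigned ordinal) {
    return (uint16_t)((n_desc & 0x00FFu) | ((ordinal & 0xFFu) << 8));
}
```

An ordinal extracted this way is the kind of value a lookup such as lookupTwoLevelNamespace would receive as its libraryOrdinal argument.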
Runtime symbol introspection
The symbol table enables runtime API lookups:
// Using dlsym to find a symbol at runtime
void* function_pointer = dlsym(RTLD_DEFAULT, "functionName");
if (function_pointer) {
// Cast and call the function
void (*function)(void) = (void (*)(void))function_pointer;
function();
}
At the assembly level, a call to dlsym looks like this:
// Call dlsym to find function
mov x0, #-2 // RTLD_DEFAULT, defined as (void *)-2 on macOS
adrp x1, L_function_name@PAGE // Function name string
add x1, x1, L_function_name@PAGEOFF
bl _dlsym // Call dlsym
cbz x0, L_symbol_not_found // Check if NULL
blr x0 // Call the resolved function
Library loading sequence
Library search paths
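An install name recorded in a binary can start with a placeholder that dyld expands, and environment variables can add further search locations:
- @rpath - Replaced by each run path (LC_RPATH entry) recorded in the loading binaries, tried in order
- @executable_path - Replaced by the directory containing the main executable
- @loader_path - Replaced by the directory containing the binary that requested the library
- DYLD_LIBRARY_PATH - Directories searched for the library's file name before its install name is tried
- System default paths - Fallback locations such as /usr/lib, used when the library is not found elsewhere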
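How the @executable_path and @loader_path placeholders turn into concrete paths can be sketched as simple prefix substitution. The function expand_install_name below is a hypothetical helper for illustration, not a dyld API. It leaves out @rpath, which would need a loop over the run path list, and it does not consult any environment variables.

```c
#include <stdio.h>
#include <string.h>

/* Expand an @executable_path/ or @loader_path/ prefix. Names without
   either prefix are copied unchanged. Returns 0 on success, -1 if the
   result does not fit in `out`. */
static int expand_install_name(const char *name,
                               const char *executable_dir,
                               const char *loader_dir,
                               char *out, size_t out_size) {
    static const char exe_prefix[]    = "@executable_path/";
    static const char loader_prefix[] = "@loader_path/";
    int written;

    if (strncmp(name, exe_prefix, sizeof exe_prefix - 1) == 0) {
        written = snprintf(out, out_size, "%s/%s", executable_dir,
                           name + sizeof exe_prefix - 1);
    } else if (strncmp(name, loader_prefix, sizeof loader_prefix - 1) == 0) {
        written = snprintf(out, out_size, "%s/%s", loader_dir,
                           name + sizeof loader_prefix - 1);
    } else {
        written = snprintf(out, out_size, "%s", name);
    }
    return (written >= 0 && (size_t)written < out_size) ? 0 : -1;
}
```

Given the name @loader_path/libbar.dylib and a loader directory of /opt/plugins, the helper produces /opt/plugins/libbar.dylib.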
Example of @rpath usage in Objective-C
// Loading a framework at runtime using @rpath
- (BOOL)loadFramework {
NSString *frameworkPath = @"@rpath/MyFramework.framework/MyFramework";
void *handle = dlopen([frameworkPath UTF8String], RTLD_LAZY);
if (!handle) {
NSLog(@"Failed to load: %s", dlerror());
return NO;
}
// Find and call initialization function
InitFunction initFunc = (InitFunction)dlsym(handle, "InitializeFramework");
if (initFunc) {
return initFunc();
}
return NO;
}
Assembly for dynamic loading:
// Prepare UTF8String call
mov x0, x20 // NSString frameworkPath
adrp x1, L_sel_UTF8String@PAGE
ldr x1, [x1, L_sel_UTF8String@PAGEOFF]
bl _objc_msgSend
// Call dlopen (x0 already holds the UTF8String result, which is the path argument)
mov w1, #1 // RTLD_LAZY
bl _dlopen
mov x19, x0 // Save handle
// Check for NULL
cbz x19, L_error_handler
// Call dlsym to find function
mov x0, x19 // Library handle
adrp x1, L_func_name@PAGE
add x1, x1, L_func_name@PAGEOFF // Address of the C string holding the function name
bl _dlsym
mov x20, x0 // Save function pointer
// Check function pointer
cbz x20, L_no_function
// Call function
blr x20
Third-party library integration patterns
Dynamically loaded plugins
// Plugin manager implementation
@implementation PluginManager
- (NSArray<id<PluginProtocol>> *)loadPlugins {
NSMutableArray *plugins = [NSMutableArray array];
NSString *pluginsDir = [[NSBundle mainBundle] pathForResource:@"Plugins" ofType:nil];
NSArray *pluginFiles = [[NSFileManager defaultManager] contentsOfDirectoryAtPath:pluginsDir error:nil];
for (NSString *pluginName in pluginFiles) {
if ([pluginName hasSuffix:@".bundle"]) {
NSString *pluginPath = [pluginsDir stringByAppendingPathComponent:pluginName];
NSBundle *pluginBundle = [NSBundle bundleWithPath:pluginPath];
if ([pluginBundle load]) {
// Get principal class that conforms to PluginProtocol
Class principalClass = [pluginBundle principalClass];
if ([principalClass conformsToProtocol:@protocol(PluginProtocol)]) {
id<PluginProtocol> plugin = [[principalClass alloc] init];
[plugins addObject:plugin];
}
}
}
}
return plugins;
}
@end
Assembly representation of bundle loading:
// Load NSBundle class
adrp x0, _OBJC_CLASS_$_NSBundle@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSBundle@PAGEOFF]
// Call bundleWithPath:
mov x2, x21 // pluginPath string
adrp x1, L_sel_bundleWithPath@PAGE
ldr x1, [x1, L_sel_bundleWithPath@PAGEOFF]
bl _objc_msgSend
mov x19, x0 // Store NSBundle
// Call load method
mov x0, x19 // NSBundle instance
adrp x1, L_sel_load@PAGE
ldr x1, [x1, L_sel_load@PAGEOFF]
bl _objc_msgSend
cbz w0, L_load_failed // Test boolean result
// Get principal class
mov x0, x19 // NSBundle instance
adrp x1, L_sel_principalClass@PAGE
ldr x1, [x1, L_sel_principalClass@PAGEOFF]
bl _objc_msgSend
mov x20, x0 // Store Class
Static library integration
// Using a statically linked library
#import "ThirdPartyLib.h"
- (void)useStaticLibrary {
// Initialize the library
TPLManager *manager = [TPLManager sharedManager];
// Configure with API key
[manager setAPIKey:@"your-api-key"];
// Use library functionality
TPLResult *result = [manager processData:self.inputData];
// Handle result
if (result.success) {
self.outputLabel.text = result.outputString;
}
}
Assembly pattern for static library calls:
// Get singleton instance
adrp x0, _OBJC_CLASS_$_TPLManager@PAGE
ldr x0, [x0, _OBJC_CLASS_$_TPLManager@PAGEOFF]
adrp x1, L_sel_sharedManager@PAGE
ldr x1, [x1, L_sel_sharedManager@PAGEOFF]
bl _objc_msgSend
mov x19, x0 // Store manager instance
// Set API key
mov x0, x19 // Manager instance
adrp x2, L_api_key@PAGE
ldr x2, [x2, L_api_key@PAGEOFF] // API key string
adrp x1, L_sel_setAPIKey@PAGE
ldr x1, [x1, L_sel_setAPIKey@PAGEOFF]
bl _objc_msgSend
// Process data
mov x0, x19 // Manager instance
ldr x20, [x21, #8] // Load self.inputData from ivar offset
mov x2, x20 // Input data
adrp x1, L_sel_processData@PAGE
ldr x1, [x1, L_sel_processData@PAGEOFF]
bl _objc_msgSend
mov x22, x0 // Store result
CocoaPods/Swift Package integration
// Using a library added via CocoaPods or Swift Package Manager
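#import "MyApp-Swift.h" // Generated header exposing AlamofireWrapper (hypothetical file name; Alamofire itself is Swift-only and has no Objective-C header)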
#import <AFNetworking/AFNetworking.h> // CocoaPod
- (void)makeNetworkRequest {
// Using AFNetworking (Objective-C library)
AFHTTPSessionManager *manager = [AFHTTPSessionManager manager];
[manager GET:@"https://api.example.com/data"
parameters:nil
headers:nil
progress:nil
success:^(NSURLSessionDataTask *task, id responseObject) {
NSLog(@"JSON: %@", responseObject);
}
failure:^(NSURLSessionDataTask *task, NSError *error) {
NSLog(@"Error: %@", error);
}];
// Using Alamofire from Swift (via bridging)
[AlamofireWrapper requestURL:@"https://api.example.com/profile"
completion:^(NSDictionary *result, NSError *error) {
if (error) {
NSLog(@"Error: %@", error);
} else {
NSLog(@"Result: %@", result);
}
}];
}
Assembly for external library usage:
// Get AFHTTPSessionManager
adrp x0, _OBJC_CLASS_$_AFHTTPSessionManager@PAGE
ldr x0, [x0, _OBJC_CLASS_$_AFHTTPSessionManager@PAGEOFF]
adrp x1, L_sel_manager@PAGE
ldr x1, [x1, L_sel_manager@PAGEOFF]
bl _objc_msgSend
mov x19, x0 // Store manager instance
// Set up for GET call (loads URL string)
mov x0, x19 // Manager instance
adrp x2, L_url_string@PAGE
ldr x2, [x2, L_url_string@PAGEOFF]
// Set up parameters (nil)
mov x3, xzr
// Set up headers (nil)
mov x4, xzr
// Set up progress block (nil)
mov x5, xzr
// Set up success block (complex block literal setup)
// ... block setup code for success handler ...
// Set up failure block (complex block literal setup)
// ... block setup code for failure handler ...
// Call GET method
adrp x1, L_sel_GET_parameters@PAGE
ldr x1, [x1, L_sel_GET_parameters@PAGEOFF]
bl _objc_msgSend
WebSocket communication
// WebSocket implementation using SocketRocket library
- (void)setupWebSocket {
// Create WebSocket connection
SRWebSocket *webSocket = [[SRWebSocket alloc] initWithURL:[NSURL URLWithString:@"wss://websocket.example.com/socket"]];
webSocket.delegate = self;
// Set up request headers
NSDictionary *headers = @{@"Authorization": [NSString stringWithFormat:@"Bearer %@", self.accessToken]};
[webSocket setDelegateOperationQueue:[NSOperationQueue mainQueue]];
[webSocket setRequestCookies:[[NSHTTPCookieStorage sharedHTTPCookieStorage] cookiesForURL:webSocket.url]];
// Connect
[webSocket open];
self.webSocket = webSocket;
}
// WebSocket delegate methods
- (void)webSocket:(SRWebSocket *)webSocket didReceiveMessage:(id)message {
if ([message isKindOfClass:[NSString class]]) {
// Parse JSON message
NSError *jsonError;
NSDictionary *jsonData = [NSJSONSerialization JSONObjectWithData:[message dataUsingEncoding:NSUTF8StringEncoding]
options:0
error:&jsonError];
if (!jsonError) {
[self handleWebSocketEvent:jsonData];
}
}
}
- (void)webSocket:(SRWebSocket *)webSocket didFailWithError:(NSError *)error {
NSLog(@"WebSocket failed with error: %@", error);
// Attempt reconnection after delay
dispatch_after(dispatch_time(DISPATCH_TIME_NOW, 5 * NSEC_PER_SEC), dispatch_get_main_queue(), ^{
[self setupWebSocket];
});
}
- (void)webSocket:(SRWebSocket *)webSocket didCloseWithCode:(NSInteger)code reason:(NSString *)reason wasClean:(BOOL)wasClean {
NSLog(@"WebSocket closed: %@", reason);
self.webSocket = nil;
}
// Send message through WebSocket
- (void)sendEvent:(NSString *)eventType withData:(NSDictionary *)data {
if (self.webSocket.readyState != SR_OPEN) {
[self setupWebSocket];
return;
}
NSMutableDictionary *message = [NSMutableDictionary dictionaryWithDictionary:data];
message[@"type"] = eventType;
message[@"timestamp"] = @(floor([[NSDate date] timeIntervalSince1970] * 1000));
NSError *error;
NSData *jsonData = [NSJSONSerialization dataWithJSONObject:message options:0 error:&error];
if (!error) {
NSString *jsonString = [[NSString alloc] initWithData:jsonData encoding:NSUTF8StringEncoding];
[self.webSocket send:jsonString];
}
}
Assembly pattern for WebSocket operations:
// Assumes self was saved in x20 earlier in the function
// Allocate WebSocket
adrp x0, _OBJC_CLASS_$_SRWebSocket@PAGE
ldr x0, [x0, _OBJC_CLASS_$_SRWebSocket@PAGEOFF]
adrp x1, L_sel_alloc@PAGE
ldr x1, [x1, L_sel_alloc@PAGEOFF]
bl _objc_msgSend // alloc
mov x21, x0 // Save allocated instance
// Create URL
adrp x0, _OBJC_CLASS_$_NSURL@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURL@PAGEOFF]
adrp x2, L_websocket_url@PAGE
ldr x2, [x2, L_websocket_url@PAGEOFF]
adrp x1, L_sel_URLWithString@PAGE
ldr x1, [x1, L_sel_URLWithString@PAGEOFF]
bl _objc_msgSend
mov x2, x0 // URL
// Initialize WebSocket
mov x0, x21 // WebSocket (from alloc)
adrp x1, L_sel_initWithURL@PAGE
ldr x1, [x1, L_sel_initWithURL@PAGEOFF]
bl _objc_msgSend
mov x19, x0 // Store WebSocket
// Set delegate
mov x0, x19 // WebSocket
mov x2, x20 // self pointer
adrp x1, L_sel_setDelegate@PAGE
ldr x1, [x1, L_sel_setDelegate@PAGEOFF]
bl _objc_msgSend
// Open connection
mov x0, x19 // WebSocket
adrp x1, L_sel_open@PAGE
ldr x1, [x1, L_sel_open@PAGEOFF]
bl _objc_msgSend
// Store in ivar
str x19, [x20, #ivar_offset_webSocket]
GraphQL client implementation
// GraphQL client calls through a hypothetical Objective-C wrapper (Apollo iOS itself is a Swift library)
- (void)performGraphQLQuery {
// Create GraphQL query
UserProfileQuery *query = [[UserProfileQuery alloc] initWithUserId:self.userId];
// Execute query
[[ApolloClient shared] fetch:query
cachePolicy:NSURLRequestReloadIgnoringLocalCacheData
queue:dispatch_get_main_queue()
resultHandler:^(GraphQLQueryResult *result) {
if (result.error) {
NSLog(@"Error: %@", result.error);
return;
}
// Process data
UserProfile *profile = result.data.user;
self.nameLabel.text = profile.name;
self.emailLabel.text = profile.email;
// Load avatar image
if (profile.avatarUrl) {
[self.imageLoader loadImageWithURL:profile.avatarUrl
completion:^(UIImage *image) {
self.avatarImageView.image = image;
}];
}
}];
}
// Mutation example
- (void)updateUserProfile {
UpdateUserProfileMutation *mutation = [[UpdateUserProfileMutation alloc]
initWithUserId:self.userId
name:self.nameField.text
email:self.emailField.text];
[[ApolloClient shared] perform:mutation
queue:dispatch_get_main_queue()
resultHandler:^(GraphQLMutationResult *result) {
if (result.error) {
[self showErrorAlert:result.error.localizedDescription];
} else {
[self showSuccessMessage:@"Profile updated successfully"];
}
}];
}
Advanced analysis techniques
Disassembly
Disassembly is the process of converting machine code back into assembly language. This involves:
- Reconstructing assembly instructions from binary
- Creating human-readable output
- Identifying instruction boundaries
- Mapping binary patterns to assembly mnemonics
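Because every A64 instruction is 32 bits wide, mapping a bit pattern to a mnemonic comes down to testing fixed bit fields. As a minimal sketch, the C code below recognizes only the ADD/SUB (immediate) class, which covers stack adjustments such as sub sp, sp, #16 (encoded as 0xD10043FF). The AddSubImm type and decode_addsub_imm function are hypothetical names, and a real disassembler handles many more instruction classes.

```c
#include <stdint.h>

typedef struct {
    int      is_64bit;  /* sf bit: 1 for X registers, 0 for W registers */
    int      is_sub;    /* op bit: 1 for SUB, 0 for ADD */
    int      set_flags; /* S bit: 1 for ADDS/SUBS */
    unsigned rd;        /* destination register number (31 is SP when S is 0) */
    unsigned rn;        /* source register number (31 is SP) */
    uint32_t imm;       /* immediate after applying the optional shift */
} AddSubImm;

/* Returns 1 and fills `out` if `insn` is an A64 ADD/SUB (immediate)
   instruction, 0 otherwise. */
static int decode_addsub_imm(uint32_t insn, AddSubImm *out) {
    if ((insn & 0x1F800000u) != 0x11000000u) {
        return 0; /* bits 28..23 are not 100010 */
    }
    out->is_64bit  = (insn >> 31) & 1;
    out->is_sub    = (insn >> 30) & 1;
    out->set_flags = (insn >> 29) & 1;
    out->imm       = (insn >> 10) & 0xFFFu; /* 12-bit immediate */
    if ((insn >> 22) & 1) {
        out->imm <<= 12; /* sh bit: immediate is shifted left by 12 */
    }
    out->rn = (insn >> 5) & 0x1Fu;
    out->rd = insn & 0x1Fu;
    return 1;
}
```

Feeding it 0x910043FF yields an ADD with both registers equal to 31 and an immediate of 16, which is add sp, sp, #16. The ret encoding 0xD65F03C0 is rejected because it belongs to a different instruction class.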
Decompilation
Decompilation goes further than disassembly by:
- Converting assembly into higher-level languages (C/C++)
- Attempting to reconstruct original program logic
- Creating more readable and maintainable code
- Using intermediate representations (IR)
Binary analysis tools
- Binary Ninja
  - Modern interface
  - Powerful analysis capabilities
  - Support for multiple architectures
- Ghidra
  - Open-source
  - Developed by the NSA
  - Extensive plugin ecosystem
- Hopper
  - User-friendly interface
  - Good for macOS and iOS analysis
  - Quick analysis capabilities
Applications of binary analysis
Binary analysis tools are useful for:
- Reverse engineering
- Security research
- Vulnerability analysis
- Understanding closed-source software
- Debugging and troubleshooting
Objective-C internals
Object creation
Creating a new object
// Objective-C code
NSString *str = [[NSString alloc] initWithFormat:@"Hello, %@", name];
// Resulting assembly pattern
// Load NSString class reference
adrp x0, _OBJC_CLASS_$_NSString@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSString@PAGEOFF]
// Call +[NSString alloc]
adrp x8, l_selector_alloc@PAGE
ldr x1, [x8, l_selector_alloc@PAGEOFF]
bl _objc_msgSend // Selector is "alloc"; the new instance is returned in x0
// Load format string and arguments
adrp x2, l_fmt@PAGE // Format string is the first argument after the selector
ldr x2, [x2, l_fmt@PAGEOFF]
str x20, [sp] // 'name' variable: on Apple ARM64, variadic arguments are passed on the stack
// Call -[NSString initWithFormat:]
adrp x3, l_selector_initWithFormat@PAGE
ldr x1, [x3, l_selector_initWithFormat@PAGEOFF]
bl _objc_msgSend
Method dispatch
Instance method call
// Objective-C code
[myObject performAction:value];
// Resulting assembly pattern
mov x0, x19 // Load 'myObject' pointer
adrp x2, l_value@PAGE // Load 'value' parameter
ldr x2, [x2, l_value@PAGEOFF]
// Load selector
adrp x3, l_selector_performAction@PAGE
ldr x1, [x3, l_selector_performAction@PAGEOFF]
// Dynamic dispatch
bl _objc_msgSend
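The dynamic dispatch behind that call can be pictured as a lookup from selector to implementation that walks up the class hierarchy. The C model below is a deliberately simplified sketch: every type and function name in it is hypothetical, selectors are compared as strings, and there is no method cache, whereas the real runtime compares unique selector pointers and caches results.

```c
#include <stddef.h>
#include <string.h>

typedef int (*Method)(void *self, int arg);

typedef struct {
    const char *selector;
    Method      imp;
} MethodEntry;

typedef struct Class {
    const struct Class *superclass;
    const MethodEntry  *methods;
    size_t              method_count;
} Class;

typedef struct {
    const Class *isa; /* every object starts with a pointer to its class */
} Object;

/* Example methods and classes used to exercise the model. */
static int base_describe(void *self, int arg)   { (void)self; return arg; }
static int derived_perform(void *self, int arg) { (void)self; return arg * 2; }

static const MethodEntry base_methods[]    = {{"describe:", base_describe}};
static const MethodEntry derived_methods[] = {{"performAction:", derived_perform}};
static const Class BaseClass    = {NULL, base_methods, 1};
static const Class DerivedClass = {&BaseClass, derived_methods, 1};

/* Walk the class hierarchy looking for a matching selector. */
static Method lookup_method(const Class *cls, const char *selector) {
    for (; cls != NULL; cls = cls->superclass) {
        for (size_t i = 0; i < cls->method_count; i++) {
            if (strcmp(cls->methods[i].selector, selector) == 0) {
                return cls->methods[i].imp;
            }
        }
    }
    return NULL;
}

/* Models a message send with one int argument. A nil receiver yields 0,
   as in Objective-C; an unhandled selector yields -1 in this model. */
static int send_message(Object *receiver, const char *selector, int arg) {
    if (receiver == NULL) {
        return 0;
    }
    Method imp = lookup_method(receiver->isa, selector);
    return imp ? imp(receiver, arg) : -1;
}
```

Sending performAction: to a DerivedClass instance finds the method on the class itself, while describe: is found one level up on BaseClass. That fallback is why a call site alone does not tell an analyst which implementation will run.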
Property access
// Objective-C code
NSInteger count = self.itemCount;
// Resulting assembly pattern
mov x0, x19 // 'self' pointer
adrp x3, l_selector_itemCount@PAGE
ldr x1, [x3, l_selector_itemCount@PAGEOFF]
bl _objc_msgSend // Calls getter method
Memory management patterns
ARC (Automatic reference counting)
// Objective-C code
{
NSObject *temp = [[NSObject alloc] init];
[self doSomethingWith:temp];
} // temp is released automatically
// Resulting assembly pattern
// Object creation
adrp x0, _OBJC_CLASS_$_NSObject@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSObject@PAGEOFF]
adrp x1, l_selector_alloc@PAGE
ldr x1, [x1, l_selector_alloc@PAGEOFF]
bl _objc_msgSend // alloc
// Init
adrp x1, l_selector_init@PAGE
ldr x1, [x1, l_selector_init@PAGEOFF]
bl _objc_msgSend
mov x20, x0 // Save the initialized object (init may return a different pointer)
// Use object
mov x0, x19 // self
mov x2, x20 // temp object
adrp x1, l_selector_doSomethingWith@PAGE
ldr x1, [x1, l_selector_doSomethingWith@PAGEOFF]
bl _objc_msgSend
// Implicit release at scope end
mov x0, x20
bl _objc_release
Common patterns
File I/O operations
// Objective-C code
NSData *data = [@"Hello" dataUsingEncoding:NSUTF8StringEncoding];
[data writeToFile:@"/path/file.txt" atomically:YES];
// Resulting assembly pattern
// Creating the NSData object
adrp x0, l_string_Hello@PAGE
ldr x0, [x0, l_string_Hello@PAGEOFF]
mov w2, #4 // NSUTF8StringEncoding value
adrp x1, l_selector_dataUsingEncoding@PAGE
ldr x1, [x1, l_selector_dataUsingEncoding@PAGEOFF]
bl _objc_msgSend
mov x19, x0 // Store NSData result
// Prepare arguments for writeToFile method
mov x0, x19 // NSData object
adrp x2, l_path_string@PAGE
ldr x2, [x2, l_path_string@PAGEOFF]
mov w3, #1 // YES for atomically parameter
adrp x1, l_selector_writeToFile_atomically@PAGE
ldr x1, [x1, l_selector_writeToFile_atomically@PAGEOFF]
bl _objc_msgSend
User interface operations
// Objective-C code
UIView *view = [[UIView alloc] initWithFrame:CGRectMake(0, 0, 100, 100)];
view.backgroundColor = [UIColor redColor];
// Resulting assembly pattern
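// Load UIView class
adrp x0, _OBJC_CLASS_$_UIView@PAGE
ldr x0, [x0, _OBJC_CLASS_$_UIView@PAGEOFF]
adrp x1, l_selector_alloc@PAGE
ldr x1, [x1, l_selector_alloc@PAGEOFF]
bl _objc_msgSend // alloc
// Prepare frame values (CGRectMake); the four doubles are passed in d0-d3
fmov d0, xzr // x = 0.0
fmov d1, xzr // y = 0.0
mov x8, #0x4059000000000000 // Bit pattern of 100.0 (not encodable as an fmov immediate)
fmov d2, x8 // width = 100.0
fmov d3, x8 // height = 100.0
// Call initWithFrame: (x0 still holds the instance returned by alloc)
mov x20, x0 // Store UIView instance
adrp x1, l_selector_initWithFrame@PAGE
ldr x1, [x1, l_selector_initWithFrame@PAGEOFF]
bl _objc_msgSend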
mov x19, x0 // Store result
// Get UIColor redColor
adrp x0, _OBJC_CLASS_$_UIColor@PAGE
ldr x0, [x0, _OBJC_CLASS_$_UIColor@PAGEOFF]
adrp x1, l_selector_redColor@PAGE
ldr x1, [x1, l_selector_redColor@PAGEOFF]
bl _objc_msgSend
mov x20, x0 // Store UIColor
// Set backgroundColor property
mov x0, x19 // UIView instance
adrp x1, l_selector_setBackgroundColor@PAGE
ldr x1, [x1, l_selector_setBackgroundColor@PAGEOFF]
mov x2, x20 // UIColor instance
bl _objc_msgSend
Networking operations
// Objective-C code
NSURL *url = [NSURL URLWithString:@"https://example.com"];
NSURLRequest *request = [NSURLRequest requestWithURL:url];
NSURLSession *session = [NSURLSession sharedSession];
NSURLSessionDataTask *task = [session dataTaskWithRequest:request
completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
// Handle response
}];
[task resume];
// Resulting assembly pattern
// Create NSURL
adrp x0, _OBJC_CLASS_$_NSURL@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURL@PAGEOFF]
adrp x2, l_url_string@PAGE
ldr x2, [x2, l_url_string@PAGEOFF]
adrp x1, l_selector_URLWithString@PAGE
ldr x1, [x1, l_selector_URLWithString@PAGEOFF]
bl _objc_msgSend
mov x20, x0 // Save NSURL for the request
// Create NSURLRequest
adrp x0, _OBJC_CLASS_$_NSURLRequest@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURLRequest@PAGEOFF]
mov x2, x20 // NSURL
adrp x1, l_selector_requestWithURL@PAGE
ldr x1, [x1, l_selector_requestWithURL@PAGEOFF]
bl _objc_msgSend
mov x20, x0 // Store NSURLRequest
// Get shared session
adrp x0, _OBJC_CLASS_$_NSURLSession@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURLSession@PAGEOFF]
adrp x1, l_selector_sharedSession@PAGE
ldr x1, [x1, l_selector_sharedSession@PAGEOFF]
bl _objc_msgSend
mov x21, x0 // Store NSURLSession
// Create data task with request and completion block
mov x0, x21 // NSURLSession
mov x2, x20 // NSURLRequest
// ... complex block setup ...
adrp x1, L_sel_dataTaskWithRequest@PAGE
ldr x1, [x1, L_sel_dataTaskWithRequest@PAGEOFF]
bl _objc_msgSend
mov x22, x0 // Store task
// Resume task
mov x0, x22
adrp x1, L_sel_resume@PAGE
ldr x1, [x1, L_sel_resume@PAGEOFF]
bl _objc_msgSend
Authentication and OAuth flows
// OAuth 2.0 authentication flow
- (void)authenticateWithOAuth {
// Configure OAuth parameters
NSDictionary *params = @{
@"client_id": @"your-client-id",
@"client_secret": @"your-client-secret",
@"grant_type": @"authorization_code",
@"code": self.authorizationCode,
@"redirect_uri": @"your-app://oauth-callback"
};
// Create request
NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:[NSURL URLWithString:@"https://oauth.example.com/token"]];
[request setHTTPMethod:@"POST"];
[request setValue:@"application/x-www-form-urlencoded" forHTTPHeaderField:@"Content-Type"];
// Convert parameters to form body
NSMutableArray *formItems = [NSMutableArray array];
for (NSString *key in params) {
NSString *encodedKey = [key stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
NSString *encodedValue = [params[key] stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
[formItems addObject:[NSString stringWithFormat:@"%@=%@", encodedKey, encodedValue]];
}
NSString *formBody = [formItems componentsJoinedByString:@"&"];
[request setHTTPBody:[formBody dataUsingEncoding:NSUTF8StringEncoding]];
// Execute request
NSURLSession *session = [NSURLSession sharedSession];
NSURLSessionDataTask *task = [session dataTaskWithRequest:request completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
if (error) {
NSLog(@"Authentication error: %@", error);
return;
}
// Parse token response
NSError *jsonError;
NSDictionary *tokenResponse = [NSJSONSerialization JSONObjectWithData:data options:0 error:&jsonError];
if (jsonError) {
NSLog(@"JSON parsing error: %@", jsonError);
return;
}
// Store access token
NSString *accessToken = tokenResponse[@"access_token"];
NSString *refreshToken = tokenResponse[@"refresh_token"];
self.tokenExpiration = [NSDate dateWithTimeIntervalSinceNow:[tokenResponse[@"expires_in"] doubleValue]];
// Save tokens securely
[KeychainManager saveAccessToken:accessToken refreshToken:refreshToken];
}];
[task resume];
}
Assembly representation of complex network flow:
// Create NSURL
adrp x0, _OBJC_CLASS_$_NSURL@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURL@PAGEOFF]
adrp x2, L_url_string@PAGE
ldr x2, [x2, L_url_string@PAGEOFF]
adrp x1, L_sel_URLWithString@PAGE
ldr x1, [x1, L_sel_URLWithString@PAGEOFF]
bl _objc_msgSend
mov x2, x0 // URL for request
// Create request
adrp x0, _OBJC_CLASS_$_NSMutableURLRequest@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSMutableURLRequest@PAGEOFF]
adrp x1, L_sel_requestWithURL@PAGE
ldr x1, [x1, L_sel_requestWithURL@PAGEOFF]
bl _objc_msgSend
mov x19, x0 // Request object
// Set HTTP method
mov x0, x19 // Request
adrp x2, L_POST_method@PAGE
ldr x2, [x2, L_POST_method@PAGEOFF]
adrp x1, L_sel_setHTTPMethod@PAGE
ldr x1, [x1, L_sel_setHTTPMethod@PAGEOFF]
bl _objc_msgSend
// More request setup...
// ... (headers, parameters, etc.)
// Get shared session
adrp x0, _OBJC_CLASS_$_NSURLSession@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURLSession@PAGEOFF]
adrp x1, L_sel_sharedSession@PAGE
ldr x1, [x1, L_sel_sharedSession@PAGEOFF]
bl _objc_msgSend
mov x20, x0 // Session object
// Create data task with request and completion block
mov x0, x20 // Session
mov x2, x19 // NSMutableURLRequest
// ... complex block setup ...
adrp x1, L_sel_dataTaskWithRequest@PAGE
ldr x1, [x1, L_sel_dataTaskWithRequest@PAGEOFF]
bl _objc_msgSend
mov x21, x0 // Task
// Resume task
mov x0, x21
adrp x1, L_sel_resume@PAGE
ldr x1, [x1, L_sel_resume@PAGEOFF]
bl _objc_msgSend
Anti-analysis techniques
The following examples show techniques that can complicate reverse engineering and binary analysis.
Dynamic library subversion
Delayed loading and runtime injection
Obfuscated software can avoid static detection by loading libraries only when needed:
// Direct library load (easy to identify statically)
#include <dlfcn.h>
void* handle = dlopen("libexample.dylib", RTLD_LAZY);
// Less obvious statically: constructing the library name at runtime
char lib_name[20] = {0};
strcpy(lib_name, "lib");
strcat(lib_name, "example");
strcat(lib_name, ".dylib");
void* handle = dlopen(lib_name, RTLD_LAZY);
When disassembled, the second approach shows no obvious library names:
// Disassembly shows only strcpy/strcat calls with partial strings
adrp x0, l_buffer@PAGE
add x0, x0, l_buffer@PAGEOFF
adrp x1, l_lib@PAGE // Just "lib"
add x1, x1, l_lib@PAGEOFF
bl _strcpy
adrp x0, l_buffer@PAGE
add x0, x0, l_buffer@PAGEOFF
adrp x1, l_example@PAGE // Just "example"
add x1, x1, l_example@PAGEOFF
bl _strcat
// ... more strcat calls ...
DYLD_INSERT_LIBRARIES behavior
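The macOS dynamic linker can be influenced through environment variables, although dyld ignores them for binaries protected by System Integrity Protection or the hardened runtime: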
# Inject library into all processes started by the shell
export DYLD_INSERT_LIBRARIES=/path/to/example.dylib
// Less obvious in static analysis (C): the command string is assembled at runtime
char cmd[256];
snprintf(cmd, sizeof(cmd), "launchctl setenv DYLD_INSERT_LIBRARIES %s",
"/path/to/example.dylib");
system(cmd);
Function hooking via dynamic loader
Intercepting functions by manipulating the dynamic linker’s symbol tables:
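// Hook implementation
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
int hooked_open(const char *path, int flags, ...) {
    // Log or modify parameters
    printf("Opening file: %s\n", path);
    // The mode argument is only present when O_CREAT is set
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list args;
        va_start(args, flags);
        mode = (mode_t)va_arg(args, int); // mode_t is promoted to int in varargs
        va_end(args);
    }
    // Original function pointer obtained through dlsym
    static int (*original_open)(const char*, int, ...) = NULL;
    if (!original_open)
        original_open = (int (*)(const char*, int, ...))dlsym(RTLD_NEXT, "open");
    return original_open(path, flags, mode);
}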
If the function name is built or decoded at runtime, the disassembly shows the call to dlsym but not which function is being hooked:
// Name resolved at runtime, which hinders static analysis
adrp x19, l_function_name@PAGE
ldr x19, [x19, l_function_name@PAGEOFF] // Pointer to the runtime-built name
mov x0, #-1 // RTLD_NEXT, defined as (void *)-1 on macOS
mov x1, x19
bl _dlsym
These techniques can complicate reverse engineering by hiding loaded libraries, obscuring function calls, or changing how analysis tools observe runtime behavior.
Why stack deallocation is necessary
Stack deallocation, which restores the stack pointer, is required for several reasons:
// Function with stack allocation and deallocation
function_example:
// Allocate 32 bytes on stack
sub sp, sp, #32
// Use stack space...
// Deallocate 32 bytes
add sp, sp, #32
ret
Memory management
- Stack exhaustion: Without deallocation, each function call would permanently consume stack space. Since the stack is a limited resource (on macOS, 8 MB by default for the main thread and 512 KB for secondary threads), the program would quickly run out of stack space, resulting in a stack overflow crash.
- Resource efficiency: The stack size is finite and allocated per thread. Improper deallocation wastes this limited resource.
Function call convention
- Caller/callee contract: The calling convention requires that functions restore the stack to its original state before returning. This is a fundamental contract between caller and callee.
- Predictable return state: The caller expects the stack to be in the same state after the callee returns. Violating this expectation breaks the function call mechanism.
Stack frame integrity
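- Return address integrity: On ARM64 the return address lives in the link register (x30), and non-leaf functions save it on the stack. If the stack pointer is not restored, a later sp-relative reload of x29/x30 reads the wrong slot, causing unpredictable returns.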
- Frame pointer chain: Improper stack management breaks the frame pointer chain, corrupting stack traces for debugging.
What happens without deallocation
// Problematic function that doesn't deallocate stack
bad_function:
// Allocate 32 bytes on stack
sub sp, sp, #32
// Use stack space...
// Return WITHOUT deallocating!
ret
This bad practice causes:
- Progressive stack loss: Each call to bad_function permanently loses 32 bytes of stack
- Eventual crash: After enough calls, a stack overflow error occurs
- Corrupted returns: If other functions are called, their return addresses will be misplaced
- Undefined behavior: The program may work for a while but exhibit increasingly erratic behavior
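The arithmetic behind the progressive loss can be modeled in plain C. This is a toy simulation, not real stack manipulation: `sim_sp`, `STACK_SIZE`, and the function names are hypothetical, and the 512 KiB budget is an assumed stack size chosen for illustration.

```c
#include <stdint.h>

#define STACK_SIZE  (512u * 1024u)  /* assumed stack budget: 512 KiB */
#define FRAME_BYTES 32u             /* bytes allocated per call, as in bad_function */

static uint32_t sim_sp = STACK_SIZE;  /* simulated stack pointer, grows downward */

/* Models `sub sp, sp, #32` followed by `add sp, sp, #32`. */
void balanced_call(void) {
    sim_sp -= FRAME_BYTES;
    sim_sp += FRAME_BYTES;
}

/* Models `sub sp, sp, #32` with no matching add.
 * Returns 0 once the simulated stack cannot hold another frame. */
int leaky_call(void) {
    if (sim_sp < FRAME_BYTES)
        return 0;
    sim_sp -= FRAME_BYTES;
    return 1;
}

/* Counts how many leaky calls fit before the simulated stack runs out. */
uint32_t calls_until_exhaustion(void) {
    uint32_t calls = 0;
    sim_sp = STACK_SIZE;
    while (leaky_call())
        calls++;
    return calls;
}
```

With these assumed numbers, 524288 / 32 = 16384 leaky calls exhaust the simulated stack, while any number of balanced calls leaves `sim_sp` unchanged.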
Stack memory vs heap memory
Unlike heap memory, which persists until explicitly freed, stack memory is tied to function execution scope:
- Heap: malloc() allocates, memory persists until free() is called
- Stack: Allocation is managed by stack pointer adjustment, and must be balanced within each function
- Automatic cleanup: While local variables are automatically “cleaned up” when they go out of scope, the stack pointer itself must be explicitly restored
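The lifetime difference matters as soon as a function needs to hand data back to its caller. In the sketch below, `make_greeting` is a hypothetical helper: it builds a string in a stack buffer, then copies it to the heap so that it outlives the call.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: formats a greeting and returns a heap copy.
 * The caller owns the result and must free() it. Returns NULL on failure. */
char *make_greeting(const char *name) {
    char local[64];  /* stack storage: invalid once this function returns */
    int len = snprintf(local, sizeof(local), "hello, %s", name);
    if (len < 0 || (size_t)len >= sizeof(local))
        return NULL; /* formatting error or truncated output */

    char *copy = malloc((size_t)len + 1);  /* heap storage: persists until free() */
    if (copy != NULL)
        memcpy(copy, local, (size_t)len + 1);
    return copy;     /* returning `local` itself would leave a dangling pointer */
}
```

The stack buffer disappears when the function's frame is released, so only the heap copy is safe to return. The cost is that the caller now has to call free() on it.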
Debugging stack issues
Stack allocation/deallocation bugs are often detected through:
- Stack corruption errors
- Unexpected program crashes
- Corrupted call stacks in crash reports
- Pointer arithmetic errors with stack-based data
- Compiler warnings about frame pointer usage
In performance-critical code, functions may sometimes omit frame pointers (via compiler optimizations), but proper stack pointer management is always required for program correctness.