A long-form reference covering machine code, assembly, executable formats, memory, and binary analysis.

Low-level programming, binary analysis, and security

This reference covers the low-level concepts that show up in binary analysis work: machine code, assembly, executable formats, linking, memory, ARM64, dynamic loading, and Objective-C internals.

Fundamentals
Compilation and linking
System architecture
ARM architecture
- Execution levels
- Register organization
ARM64 assembly reference
Dynamic linking and loading
Advanced analysis techniques
Objective-C internals
Anti-analysis techniques

Fundamentals

Machine code

Machine code is the lowest-level programming language that processors directly execute. It consists of binary instructions (0s and 1s) that the CPU can understand and process directly.

Key characteristics

Pure binary format (0s and 1s)
Directly executable by the processor
Platform-specific
Not human-readable

Assembly language

Assembly language is a low-level programming language that provides a more human-readable representation of machine code. It uses mnemonics to represent machine instructions.

Example assembly instruction

LDR R3, [R2]

This instruction:

LDR (Load Register) is the mnemonic for the load instruction
R3 is the destination register
[R2] indicates that the value in R2 should be interpreted as a memory address
- The brackets [] tell the processor to use the value in R2 as a pointer to fetch data from memory

Labels and control flow

Labels serve multiple purposes:

Mark locations for jump instructions
Reference memory locations
Define entry points
Create readable code structure

Example of label usage:

_start:
    mov r1, #5
    b _start      @ Branch (jump) back to the _start label

Compilation and linking

Compilation pipeline

The journey from source code to executable involves several stages:

graph TD
    A[main.c] --> B[Preprocessor]
    B --> C[main.i]
    C --> D[Compiler]
    D --> E[main.s]
    E --> F[Assembler]
    F --> G[main.o]
    G --> H[Linker]
    H --> I[myapp]

Preprocessing stage

Expands #include directives
Processes macros and conditional compilation
Outputs .i files
Example:
```c
// Before preprocessing
#include <stdio.h>
#define MAX_SIZE 100

// After preprocessing
// (contents of stdio.h appear here; the #define line is gone and
//  every use of MAX_SIZE has been replaced with 100)
```

Compilation stage

Converts preprocessed C/C++ into assembly code
Outputs .s files
Example transformation:
```c
// C code
int main() {
    int x = 5;
    return x;
}

// Generated assembly (optimized build)
.text
.global _main
_main:
    mov w0, #5      // An int return value is placed in w0
    ret
```

Assembly stage

Converts assembly into object code
Outputs .o files (Mach-O format)
Uses tools like clang or as

Linking stage

Resolves external symbols
Combines object files
Creates final executable
Uses clang or ld

Executable file formats

Mach-O file format structure

Mach-O header

The Mach-O header contains important metadata about the binary:

struct mach_header {
    uint32_t magic;      // Mach magic number identifier
    cpu_type_t cputype;  // CPU type identifier
    cpu_subtype_t cpusubtype; // CPU subtype identifier
    uint32_t filetype;   // Type of file
    uint32_t ncmds;      // Number of load commands
    uint32_t sizeofcmds; // Size of all load commands
    uint32_t flags;      // Flags
};
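
To make the role of the magic field concrete, here is a minimal sketch in C that classifies the first four bytes of a file. The helper names (read_le32, classify_magic) are invented for this example. The magic values are the ones defined in Apple's mach-o/loader.h and mach-o/fat.h, written out as literals so the snippet compiles on any platform. Note that a 64-bit binary uses struct mach_header_64, which has the same fields as above plus a trailing reserved word.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer (hypothetical helper). */
uint32_t read_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Classify a magic number that was read as little-endian (hypothetical helper). */
const char *classify_magic(uint32_t magic) {
    switch (magic) {
    case 0xfeedfaceU: return "Mach-O 32-bit";
    case 0xfeedfacfU: return "Mach-O 64-bit";
    case 0xcefaedfeU: return "Mach-O 32-bit (byte-swapped)";
    case 0xcffaedfeU: return "Mach-O 64-bit (byte-swapped)";
    /* Fat headers are stored big-endian, so 0xcafebabe reads back swapped. */
    case 0xbebafecaU: return "universal (fat) binary";
    default:          return "not Mach-O";
    }
}
```

An ARM64 macOS executable starts with the bytes CF FA ED FE, which read_le32 turns into 0xfeedfacf.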

Key Mach-O sections

Text section (__TEXT,__text)
- Contains executable code
- Read-only and executable
- Example:
```
.text
.global _main
_main:
  mov x0, #0
  ret
```
Data section (__DATA,__data)
- Initialized global variables
- Read-write permissions
- Example:
```
int global_var = 42;  // Stored in __DATA
```
BSS Section (__DATA,__bss)
- Uninitialized global variables
- Zero-initialized at runtime
- Example:
```
int uninit_var;  // Stored in __DATA,__bss
```
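Read-Only Data (__TEXT,__const and __TEXT,__cstring)
- Constant data (__TEXT,__const)
- String literals (__TEXT,__cstring)
- Example:
```
const int limit = 10;        // Stored in __TEXT,__const
const char* msg = "Hello";   // The literal "Hello" is in __TEXT,__cstring; the pointer msg is in __DATA,__data
```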
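Thread Local Storage (__DATA,__thread_data and __DATA,__thread_bss)
- Thread-specific variables
- Example:
```
__thread int thread_var = 1;  // Initialized: stored in __DATA,__thread_data
__thread int thread_zero;     // Uninitialized: stored in __DATA,__thread_bss
```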

Symbol management

Symbol table structure

Symbol tables are critical data structures in executable files that connect symbolic names (such as function and variable names) to their locations in memory. They serve as lookup mechanisms for the linker and loader.

In Mach-O binaries, symbols are stored using the nlist_64 structure (for 64-bit architectures):

struct nlist_64 {
    uint32_t n_strx;   // Index into string table
    uint8_t  n_type;   // Type flag (N_EXT, N_SECT, etc.)
    uint8_t  n_sect;   // Section number where symbol is defined
    uint16_t n_desc;   // Description field (REFERENCE_FLAG_UNDEFINED_NON_LAZY, etc.)
    uint64_t n_value;  // Value (address) of the symbol
};

The key fields are:

n_strx: Index into the string table where the symbol’s name is stored
n_type: Indicates if the symbol is external, defined locally, debug information, etc.
n_sect: Section number where the symbol is defined
n_value: The address or value of the symbol (0 for an ordinary undefined symbol)
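
As an illustration of how the n_type byte is read, the following sketch decodes it into a short description. The function name describe_n_type is hypothetical; the mask and type values match the definitions in mach-o/nlist.h and are repeated here so the code builds without Apple headers.

```c
#include <stdint.h>

/* Bit masks and type values for n_type, as defined in <mach-o/nlist.h>. */
#define N_STAB 0xe0  /* any of these bits set: debugging (stab) entry */
#define N_TYPE 0x0e  /* mask for the type bits */
#define N_EXT  0x01  /* external (global) symbol */
#define N_UNDF 0x00  /* undefined */
#define N_ABS  0x02  /* absolute */
#define N_SECT 0x0e  /* defined in the section given by n_sect */

/* Hypothetical helper: turn an n_type byte into a readable description. */
const char *describe_n_type(uint8_t n_type) {
    if (n_type & N_STAB) return "debug";
    switch (n_type & N_TYPE) {
    case N_UNDF: return "undefined";
    case N_ABS:  return (n_type & N_EXT) ? "absolute, external" : "absolute, local";
    case N_SECT: return (n_type & N_EXT) ? "defined, external" : "defined, local";
    default:     return "other";
    }
}
```

A value of 0x0f (N_SECT | N_EXT) therefore describes a global symbol defined in some section, which is what nm prints as an uppercase letter such as T.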

Symbol resolution during static linking

During compilation and static linking:

// Compiler generates code with placeholder references
int add(int a, int b);  // External symbol, to be resolved by linker

int main() {
    int result = add(5, 3);  // Reference to external function
    return result;
}

The linker:

Reads each object file’s symbol table
Resolves references against definitions
Determines final addresses for symbols
Updates all references to point to correct locations
Creates the final executable’s symbol table
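
The resolution step in the list above can be sketched as a toy model in C. This is not how ld is implemented: the ToySymbol type and resolve_symbols function are invented for illustration, and a real linker uses hash tables, handles duplicates and weak definitions, and patches relocations rather than a flat array.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy symbol record (hypothetical; far simpler than nlist_64). */
typedef struct {
    const char *name;
    int defined;        /* 1 for a definition, 0 for an unresolved reference */
    uint64_t address;   /* final address; filled in for references on success */
} ToySymbol;

/* Match every reference against the definitions in the same array.
   Returns the number of references that stayed unresolved. */
size_t resolve_symbols(ToySymbol *syms, size_t count) {
    size_t unresolved = 0;
    for (size_t i = 0; i < count; i++) {
        if (syms[i].defined) continue;
        int found = 0;
        for (size_t j = 0; j < count; j++) {
            if (syms[j].defined && strcmp(syms[i].name, syms[j].name) == 0) {
                syms[i].address = syms[j].address;  /* point reference at definition */
                found = 1;
                break;
            }
        }
        if (!found) unresolved++;
    }
    return unresolved;
}
```

A nonzero return value corresponds to the familiar "undefined symbols" error at link time.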

Symbol table in binary analysis

The symbol table is important for binary analysis:

# View symbols in a binary
nm /bin/ls

Output shows symbol types and addresses:

0000000100003f78 T _main                 # 'T' means defined, text section
                 U _malloc               # 'U' means undefined (external)
0000000100001f50 t _helper_function      # 't' means local text symbol

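Strip a binary to remove symbols (anti-analysis). With the macOS strip tool, -x removes local symbols and keeps global ones; its -s option takes a file listing symbols to keep, unlike GNU strip, where -s removes all symbols:

strip -x binary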

Dynamic symbol resolution

Global offset table (GOT) and position-independent code

Dynamic symbol resolution on ARM64 macOS/iOS uses a technique involving the Global Offset Table (GOT) and special relocations:

// Loading a symbol's address from its GOT entry
adrp x0, _printf@GOTPAGE
ldr x0, [x0, _printf@GOTPAGEOFF]

How this works:

What is the GOT?
- The Global Offset Table is a data structure containing addresses of external symbols
- It’s populated by the dynamic linker (dyld) at runtime
- Enables position-independent code by avoiding hardcoded addresses
The GOTPAGE/GOTPAGEOFF relocations:
- @GOTPAGE: Calculates the page address (upper bits) of the symbol’s entry in the GOT
- @GOTPAGEOFF: Calculates the offset within that page (lower bits)
- Combined, they form the complete address
Two-step loading process:
- adrp x0, _printf@GOTPAGE: Load the page address containing printf’s GOT entry into x0
- ldr x0, [x0, _printf@GOTPAGEOFF]: From that page, load the actual function address
Benefits:
- Code can be loaded at any address without modification
- External libraries can be loaded anywhere in memory
- Supports Address Space Layout Randomization (ASLR)
- Complements lazy binding, which resolves function calls on first use through stubs and lazy symbol pointers rather than through the GOT

This is equivalent to asking: “Where is printf actually located?” at runtime instead of assuming a fixed address.

Position-independent code

Position-independent code (PIC) is code that can be executed regardless of where it’s loaded in memory. To understand this concept:

Simple analogy

Consider two types of maps:

Absolute map: Contains directions like “Go to 123 Main Street” - only works if buildings never change addresses.
Relative map: Contains directions like “Go 3 blocks north from where you are” - works regardless of the starting point.

PIC follows the relative model. It uses instructions that work regardless of where the code is loaded in memory.

What makes code position-dependent?

Traditional (non-PIC) code often contains:

Hardcoded memory addresses
Direct references to specific locations
Assumptions about where it will be loaded

For example, this position-dependent code assumes it knows exactly where function foo() is located:

// Position-dependent (assumes foo() is at exactly 0x12345678)
JUMP 0x12345678   // Direct jump to hardcoded address

What makes code position-independent?

PIC uses techniques to avoid hardcoded addresses:

Relative addressing (offsets from current position)
Indirection tables (GOT, PLT)
PC-relative calculations

The same example using PIC:

// Position-independent
LOAD address_of_foo from GOT  // Look up actual location
JUMP to that address          // Jump to wherever foo() actually is

Technical implementation

Position-independent code relies on these key strategies:

PC-relative addressing - Instructions reference memory locations based on the current program counter
Indirection tables - Tables of pointers that are updated by the loader
Relocation information - Extra data that helps the loader patch the code correctly

The GOTPAGE/GOTPAGEOFF example that follows shows how ARM64 uses PC-relative addressing to efficiently implement PIC on modern systems.

Why position-independent code matters

Position-independent code is fundamental to modern operating systems for several key reasons:

Shared libraries and dynamic loading
- Allows a single copy of a library to be shared among multiple running processes
- Enables libraries to be loaded at any available memory address
- Supports dynamic loading and unloading of modules at runtime
Security through ASLR
- Address Space Layout Randomization (ASLR) randomizes memory addresses to mitigate attacks
- Makes exploitation techniques like return-oriented programming (ROP) much harder
- Without PIC, ASLR would be impossible or severely limited
- Prevents attackers from reliably predicting memory addresses of code/data
Efficient memory usage
- Multiple processes can use the same physical memory for shared library code
- Only data sections need separate copies per process
- Significantly reduces overall system memory footprint
Modern OS requirements
- macOS and iOS require position-independent executables on ARM64, and most Linux distributions build them by default
- Code signing and security features rely on code not being modified during loading
- 64-bit instruction sets such as ARM64 and x86-64 provide PC-relative addressing, so PIC costs little at run time
Compatibility with hardware features
- Modern CPU architectures like ARM64 are designed with PIC in mind
- Special relocation types (like GOTPAGE/GOTPAGEOFF) optimize PIC performance

Without position-independent code, modern memory protection, shared library loading, and flexible dynamic loading would be much harder to implement.

Detailed GOTPAGE/GOTPAGEOFF example

The following example shows what happens when a binary calls a library function such as strlen from libc:

// Code in the binary
adrp x8, _strlen@GOTPAGE    // Get page address of strlen's GOT entry
ldr  x8, [x8, _strlen@GOTPAGEOFF]  // Load actual address from GOT
mov  x0, x20                // Move string pointer to x0 (first argument)
blr  x8                     // Call strlen indirectly

Memory layout example

Consider this memory layout:

The code is loaded at address 0x100008000
The GOT is at address 0x100010000
The strlen entry in the GOT is at 0x100010088
The actual strlen function is at 0x7fff2037a4b0 (in libc)

Runtime behavior

ADRP instruction (adrp x8, _strlen@GOTPAGE):
- The linker calculates: GOT address for strlen is 0x100010088
- The page address (4KB-aligned) of this GOT entry is 0x100010000
- The ADRP instruction adds the encoded page distance to the page of its own address (0x100008000) and loads 0x100010000 into x8
LDR instruction (ldr x8, [x8, _strlen@GOTPAGEOFF]):
- The offset within the page is 0x88 (0x100010088 - 0x100010000)
- The instruction reads memory at address 0x100010000 + 0x88
- This loads the value stored at GOT entry: 0x7fff2037a4b0 (actual strlen address)
- Now x8 contains the real address of strlen
Function call (blr x8):
- Calls the function at the address in x8, with the string pointer in x0
- This jumps to the actual strlen implementation at 0x7fff2037a4b0

Disassembled view

A disassembler such as Hopper might show:

0x100008000:  adrp   x8, #0x100010000            ; _strlen@GOTPAGE
0x100008004:  ldr    x8, [x8, #0x88]             ; _strlen@GOTPAGEOFF
0x100008008:  mov    x0, x20
0x10000800C:  blr    x8

Advantages of this approach

Relocation-free code: The actual strlen address (0x7fff2037a4b0) never appears in the code
ASLR support: If libc loads at a different address next time, only the GOT entry changes
Load-time binding: dyld fills in the GOT entry before the code runs; calls that go through stubs can instead be bound lazily on first use
Efficient: ARM64’s ADRP/LDR combo is optimized for exactly this use case

The ADRP/LDR pair reaches any GOT entry within about ±4 GB of the instruction, whereas a single PC-relative literal load reaches only ±1 MB.
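
The page and offset arithmetic from the walkthrough can be checked with a few lines of C. This is a sketch of the address math only, not of instruction encoding; the function names are made up for the example, and the addresses in the usage note are the illustrative ones from the memory layout above.

```c
#include <stdint.h>

#define ADRP_PAGE_MASK UINT64_C(0xFFF)  /* ADRP always works in 4 KB units */

/* Page base of an address: the value ADRP leaves in the destination register. */
uint64_t page_of(uint64_t addr) { return addr & ~ADRP_PAGE_MASK; }

/* Offset within the page: the immediate used by the following LDR. */
uint64_t page_offset(uint64_t addr) { return addr & ADRP_PAGE_MASK; }

/* Signed page count that ADRP encodes, relative to the instruction's own page. */
int64_t adrp_page_delta(uint64_t pc, uint64_t target) {
    return ((int64_t)page_of(target) - (int64_t)page_of(pc)) / 4096;
}
```

For the GOT entry at 0x100010088, page_of gives 0x100010000 and page_offset gives 0x88. With the ADRP located at 0x100008000, the encoded distance is 8 pages, which is why the same instruction bytes keep working when the whole image slides to a different base address.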

Advanced dyld symbol resolution examples

Lazy binding

In lazy binding, the symbol is resolved only when first called, improving startup performance:

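// First call to an external function (e.g., NSLog)
// 1. The call site branches to a stub in __TEXT,__stubs
bl _NSLog
// Stub generated by the linker (label names here are illustrative)
_NSLog_stub:
    // 2. Load the lazy symbol pointer from __DATA,__la_symbol_ptr and jump through it
    adrp x16, _NSLog_lazy_ptr@PAGE
    ldr x16, [x16, _NSLog_lazy_ptr@PAGEOFF]
    br x16      // Initially reaches a helper that calls dyld_stub_binder

After the first call, dyld updates the lazy symbol pointer so that it holds the resolved function address. The stub code itself is never modified, which keeps the code pages read-only and validly signed. (Toolchains that use chained fixups bind these pointers at load time instead.)

// Second call to the same function
bl _NSLog
// The stub is unchanged; only the pointer it loads has been updated
_NSLog_stub:
    adrp x16, _NSLog_lazy_ptr@PAGE
    ldr x16, [x16, _NSLog_lazy_ptr@PAGEOFF]
    br x16      // Now jumps straight to the NSLog implementation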

Symbol interposition

macOS allows interposing symbols (overriding library functions) from a library injected through the DYLD_INSERT_LIBRARIES environment variable:

// Add to interpose.c
#include <stdio.h>
#include <stdlib.h>

// Replacement for malloc
void* my_malloc(size_t size) {
    printf("Intercepted malloc(%zu)\n", size);
    // Call the original malloc
    return malloc(size);
}

// Interpose structure
static const struct { void *replacement; void *original; } _interposers[]
    __attribute__((section("__DATA,__interpose"))) = {
        { (void *)my_malloc, (void *)malloc }
    };

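After interpose.c is built as a dynamic library and injected with DYLD_INSERT_LIBRARIES, dyld redirects calls to malloc made by other images to my_malloc; calls from the interposing library itself still reach the original. Because printf may itself allocate, a robust interposer also needs a re-entrancy guard.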

Framework symbol resolution

Resolving symbols from frameworks often involves image lookup:

// Load framework symbol
adrp x0, _NSSearchPathForDirectoriesInDomains@GOTPAGE
ldr x0, [x0, _NSSearchPathForDirectoriesInDomains@GOTPAGEOFF]

This pattern involves dyld’s two-level namespace, where each symbol reference includes both the symbol name and its originating image (library or framework).

Program initialization

The program startup sequence on macOS:

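The kernel maps the executable into memory and starts dyld
dyld loads the dependent libraries and frameworks
Resolves (binds) dynamic symbols
Sets up thread-local storage
Runs initializers (C++ static constructors, Objective-C +load methods)
Calls _main
Program execution begins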

Example of startup code:

.global _call_main          // Hypothetical wrapper name, used for illustration
_call_main:
    // Set up stack frame
    stp x29, x30, [sp, #-16]!
    mov x29, sp

    // Call main function
    bl _main

    // Restore stack frame
    ldp x29, x30, [sp], #16
    ret

System architecture

Operating system concepts

Privilege levels and memory access

Kernel mode vs user mode

Kernel Mode (EL1)

Full system access
Can read/write/execute any process memory
Critical operations can affect entire system

Example of kernel mode operation:

// Kernel mode code (simplified)
void kernel_memory_access(void* dest, const void* addr, size_t size) {
  // Direct memory access without checks
  memcpy(dest, addr, size);
}

User Mode (EL0)

Sandboxed environment
Isolated address space
System access through APIs

Example of user mode operation:

// User mode code
#include <sys/mman.h>

void user_memory_access(void* addr, size_t size) {
  // Privileged operations go through system calls; mprotect is the libc wrapper for one
  if (mprotect(addr, size, PROT_READ | PROT_WRITE) == -1) {
      // Handle error
  }
}

Process management

Process creation and identification

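// Process creation example
#include <stdio.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();
    if (pid == -1) {
        // fork failed; no child was created
        perror("fork");
        return 1;
    } else if (pid == 0) {
        // Child process
        printf("Child PID: %d\n", getpid());
    } else {
        // Parent process
        printf("Parent PID: %d\n", getpid());
    }
    return 0;
}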

System calls

// System call example
#include <fcntl.h>   // open, O_RDONLY
#include <unistd.h>  // close

int main() {
    // File operations through system calls
    int fd = open("file.txt", O_RDONLY);
    if (fd == -1) {
        // Handle error
        return 1;
    }
    close(fd);
    return 0;
}

Memory management

Virtual memory and page tables

// Memory mapping example
#include <sys/mman.h>

void* map_memory(size_t size) {
    void* addr = mmap(NULL, size,
                     PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS,
                     -1, 0);
    if (addr == MAP_FAILED) {
        // Handle error
        return NULL;
    }
    return addr;
}

Memory protection

// Memory protection example
int protect_memory(void* addr, size_t size) {
    // RWX permissions (platforms that enforce W^X, such as Apple silicon, may refuse this request)
    return mprotect(addr, size,
                   PROT_READ | PROT_WRITE | PROT_EXEC);
}

Heap vs stack memory management

graph TD
    A[Memory] --> B[Stack]
    A --> C[Heap]
    B --> D[Automatic allocation]
    B --> E[LIFO access pattern]
    B --> F[Limited size]
    C --> G[Dynamic allocation]
    C --> H[Random access pattern]
    C --> I[Larger size]

Stack memory characteristics

Fast allocation (just move stack pointer)
Automatic cleanup when function returns
Limited in size (typically a few MB; on macOS the defaults are 8 MB for the main thread and 512 KB for other threads)
LIFO (Last In, First Out) access pattern
Each thread has its own stack
Used for: local variables, function parameters, return addresses

Heap memory characteristics

Dynamic lifetime (independent of function scope)
Slower allocation (requires memory management algorithm)
Manual cleanup responsibility (or via garbage collection/ARC)
Much larger capacity than stack
Shared across threads (requires synchronization)
Used for: objects, dynamic arrays, data structures of unknown size
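
The lifetime difference between the two regions is easiest to see side by side. The following is a minimal sketch in plain C; the function names are invented for the example.

```c
#include <stdlib.h>

/* Works on a stack array: its storage disappears when the function returns. */
int sum_on_stack(void) {
    int values[4] = {1, 2, 3, 4};   /* automatic storage, freed on return */
    int total = 0;
    for (int i = 0; i < 4; i++) total += values[i];
    return total;   /* returning the value is fine; returning &values[0] would dangle */
}

/* Builds a heap array: it outlives the call, and the caller must free() it.
   Returns NULL if the allocation fails. */
int *make_squares(size_t count) {
    int *out = malloc(count * sizeof *out);
    if (out == NULL) return NULL;
    for (size_t i = 0; i < count; i++) out[i] = (int)(i * i);
    return out;
}
```

A caller of make_squares owns the returned block until it calls free, which is the manual cleanup responsibility listed above.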

Objective-C heap management

Manual allocation patterns (pre-ARC)

// Create an object
NSObject *obj = [[NSObject alloc] init];

// Use the object
NSString *description = [obj description];

// Release when done
[obj release];  // Decrements reference count

// Alternative pattern with autorelease
NSObject *obj2 = [[[NSObject alloc] init] autorelease];
// obj2 will be released when the current autorelease pool drains

Assembly for manual reference counting:

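// Allocate object
adrp x0, _OBJC_CLASS_$_NSObject@GOTPAGE
ldr x0, [x0, _OBJC_CLASS_$_NSObject@GOTPAGEOFF]  // x0 = NSObject class
adrp x1, L_sel_alloc@PAGE
ldr x1, [x1, L_sel_alloc@PAGEOFF]                // x1 = selector for alloc
bl _objc_msgSend  // alloc

// Init object
adrp x1, L_sel_init@PAGE
ldr x1, [x1, L_sel_init@PAGEOFF]                 // x1 = selector for init
bl _objc_msgSend                                 // x0 = initialized object
mov x20, x0       // Save object pointer in a callee-saved register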

// Use object
// ...

// Release object
mov x0, x20
bl _objc_release

ARC (Automatic reference counting) patterns

// With ARC, the compiler inserts retain/release calls
{
    NSObject *obj = [[NSObject alloc] init];
    // Use the object
    self.property = obj;
    // No explicit release needed, compiler inserts it
}

ARC makes these transformations:

Tracks object ownership throughout scope
Inserts retain when storing objects in properties/collections
Inserts release at end of scope
Adds autorelease when returning objects from methods
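
The retain and release calls that ARC inserts follow simple counting rules, which can be modeled in a few lines of C. This is only a conceptual sketch: the Counted type and its functions are invented here, and the Objective-C runtime stores and updates reference counts quite differently (and atomically).

```c
#include <stdlib.h>

/* Hypothetical reference-counted object. */
typedef struct {
    unsigned refcount;
    int payload;
} Counted;

/* Like alloc/init: the caller owns the result, so the count starts at 1. */
Counted *counted_create(int payload) {
    Counted *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->refcount = 1;
        obj->payload = payload;
    }
    return obj;
}

/* Like retain: record one more owner. */
Counted *counted_retain(Counted *obj) {
    obj->refcount++;
    return obj;
}

/* Like release: drop one owner and free on the last one. Returns 1 if freed. */
int counted_release(Counted *obj) {
    if (--obj->refcount == 0) {
        free(obj);
        return 1;
    }
    return 0;
}
```

In this model, storing the object in a strong property corresponds to counted_retain, and leaving the scope corresponds to counted_release. Calling counted_release once more after the object has been freed is the over-release error.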

Memory management best practices

// Use strong/weak references appropriately
@property (nonatomic, strong) NSObject *strongRef;  // Owns object
@property (nonatomic, weak) NSObject *weakRef;      // Doesn't own object

// Break retain cycles with weak references
@class Child;

@interface Parent : NSObject
@property (nonatomic, strong) Child *child;  // Strong reference
@end

@interface Child : NSObject
@property (nonatomic, weak) Parent *parent;  // Weak reference to break cycle
@end

// Use autorelease pools for temporary objects
@autoreleasepool {
    for (int i = 0; i < 10000; i++) {
        NSNumber *num = @(i);  // Autoreleased object
        // Process num
    }
}  // Pool drained, all autoreleased objects freed

Common memory management issues

Memory leaks

// Memory leak: Creating objects without releasing
- (void)leakExample {
    NSMutableArray *array = [[NSMutableArray alloc] init];
    [array addObject:@"item"];
    // array never released in non-ARC code

    // Even with ARC, this can leak:
    self.observer = [[NSNotificationCenter defaultCenter]
        addObserverForName:@"SomeNotification"
        object:nil
        queue:nil
        usingBlock:^(NSNotification *note) {
            [self doSomething];  // Captures self, potential cycle
        }];
    // Observer never removed
}

Use-after-free errors

// Use-after-free: Accessing freed memory
- (void)useAfterFree {
    NSObject *obj = [[NSObject alloc] init];
    [obj release];  // Object memory can be reclaimed
    NSLog(@"%@", [obj description]);  // Using freed memory (crash)
}

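// With ARC, can still happen with non-zeroing (__unsafe_unretained) references
- (void)dangleExample {
    NSObject __unsafe_unretained *unsafeObj;
    @autoreleasepool {
        NSObject *obj = [[NSObject alloc] init];
        unsafeObj = obj;
        // obj released when it goes out of scope at the closing brace
    }
    NSLog(@"%@", unsafeObj);  // Dangling pointer: undefined behavior, often a crash
    // A __weak reference would be zeroed instead and read safely as nil
}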

Over-release errors

// Over-release: Releasing more than retaining
- (void)overRelease {
    NSObject *obj = [[NSObject alloc] init];
    [obj release];  // Correct
    [obj release];  // Over-release, will crash
}

Debugging memory issues

Instruments with Leaks template for finding memory leaks
Zombies mode to detect use-after-free errors
Address Sanitizer (ASan) for detecting memory errors
Malloc Stack logging to track allocation/deallocation

Thread management

Thread creation and stack

#include <pthread.h>

void* thread_function(void* arg) {
    // Local variable: lives on this thread's own stack (not thread-local storage)
    int local_var = 42;

    // Stack operations
    char stack_buffer[1024];

    return NULL;
}

int main() {
    pthread_t thread;
    pthread_create(&thread, NULL, thread_function, NULL);
    pthread_join(thread, NULL);
    return 0;
}

ARM architecture

Execution levels

graph TD
    A[EL3: Secure Monitor] --> B[EL2: Hypervisor]
    B --> C[EL1: Kernel]
    C --> D[EL0: User]

EL0 (User Mode)
- Ordinary applications
- Least privileged level
- Restricted system access
EL1 (Kernel Mode)
- Operating system kernels
- Device drivers
- Full system access
EL2 (Hypervisor)
- Virtual machine management
- Resource partitioning
- Virtualization support
EL3 (Secure Monitor)
- TrustZone operations
- Secure world management
- Security state transitions

Register organization

General purpose registers

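// Register usage example
.global _register_demo
_register_demo:
    // Prologue: save fp, lr, and the callee-saved registers used below
    stp x29, x30, [sp, #-32]!
    stp x19, x20, [sp, #16]
    mov x29, sp

    // Argument registers (X0-X7)
    mov x0, #1      // First argument
    mov x1, #2      // Second argument

    // Caller-saved registers (X9-X15)
    mov x9, #42     // Temporary value; may be clobbered by the call
    bl _some_function

    // Callee-saved registers (X19-X28)
    mov x19, #100   // Safe to use because x19 was saved in the prologue

    // Epilogue: restore saved registers and return
    ldp x19, x20, [sp, #16]
    ldp x29, x30, [sp], #32
    ret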

Special registers

// Special register usage
.global _special_registers
_special_registers:
    // Save fp and lr first, because bl overwrites X30
    stp x29, x30, [sp, #-16]!

    // Frame Pointer (X29)
    mov x29, sp     // Set up frame pointer

    // Link Register (X30)
    bl _function    // X30 automatically set to return address

    // Stack Pointer (SP)
    sub sp, sp, #16 // Must maintain 16-byte alignment
    add sp, sp, #16 // Release the space again before returning

    // Zero Register
    mov x0, xzr     // Clear register using zero register

    ldp x29, x30, [sp], #16
    ret

ARM64 assembly reference

This section explains common ARM64 assembly instructions used throughout this reference. Understanding these instructions helps with binary analysis and reverse engineering.

Register usage

General purpose registers

ARM64 provides 31 general-purpose registers (x0-x30):

x0-x7: Parameter/result registers
- Used to pass arguments to functions
- x0 holds the return value from functions
- Can be freely modified by called functions (caller-saved)
x8: Indirect result location register
- Used for returning structures larger than 16 bytes
x9-x15: Temporary registers
- Caller-saved registers (not preserved across function calls)
- Function can use these without saving their previous values
x16-x17: Intra-procedure-call scratch registers
- Used by linker-generated stubs and veneers
- Should not be used in user code across function calls
x18: Platform register (reserved)
- Reserved by the platform on Apple systems; application code must not use it
x19-x28: Callee-saved registers
- Must be preserved by functions that use them
- If used, their values must be saved and restored
x29/fp: Frame pointer
- Points to the current function’s stack frame
- Used for accessing local variables and saved registers
x30/lr: Link register
- Holds the return address during function calls
- bl instruction automatically sets this register

Special registers

sp: Stack pointer
- Points to the current top of the stack
- Stack grows downward (toward lower addresses)
- Must maintain 16-byte alignment
pc: Program counter
- Contains address of current instruction
- Not a general-purpose register; read indirectly through instructions such as adr, adrp, and bl
xzr: Zero register
- Always reads as 0
- Writes are discarded

Memory operations

Load and store instructions

ldr x0, [x1]            // Load 64-bit value from memory address in x1 into x0

ldr (Load register): Reads data from memory into a register
[x1] means “use the value in x1 as a memory address”
The brackets [] indicate indirection (accessing memory)
This instruction reads 8 bytes (64 bits) from the address in x1

ldr w0, [x1]            // Load 32-bit value from memory address in x1 into w0

w0 refers to the lower 32 bits of x0
When loading to a W register, the upper 32 bits of the X register are zeroed

str x0, [x1]            // Store 64-bit value from x0 to memory address in x1

str (Store register): Writes data from a register to memory
This instruction writes 8 bytes (64 bits) to the address in x1

ldr x0, [x1, #16]       // Load from address (x1+16)

Adds an immediate offset (16) to the base address
Useful for accessing struct fields or array elements

ldr x0, [x1, x2]        // Load from address (x1+x2)

Adds a register offset (x2) to the base address
Good for array indexing with variable indices

Advanced memory addressing

ldr x0, [x1, #16]!      // Pre-index: Update x1 to x1+16, then load from new address

! indicates pre-indexing (address is updated before access)
First updates x1 to x1+16, then loads from that address
After execution, x1 contains the new address (x1+16)

ldr x0, [x1], #16       // Post-index: Load from x1, then update x1 to x1+16

Post-indexing (address is updated after access)
First loads from the address in x1, then updates x1 to x1+16
After execution, x1 contains the new address (x1+16)
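
The only difference between the two writeback modes is the order of the update and the access. The following C sketch models that order with a pointer standing in for x1; the function names are hypothetical, and the model ignores everything else about the instructions.

```c
#include <stdint.h>

/* Model of: ldr x0, [x1, #offset]!   (pre-index) */
uint64_t load_pre_index(const uint64_t **base, int64_t offset_bytes) {
    /* Update the base register first... */
    *base = (const uint64_t *)((const char *)*base + offset_bytes);
    /* ...then load from the new address. */
    return **base;
}

/* Model of: ldr x0, [x1], #offset    (post-index) */
uint64_t load_post_index(const uint64_t **base, int64_t offset_bytes) {
    /* Load from the current address first... */
    uint64_t value = **base;
    /* ...then update the base register. */
    *base = (const uint64_t *)((const char *)*base + offset_bytes);
    return value;
}
```

Starting from the same base and an offset of 16, both calls leave the base pointing two 8-byte elements further on, but the pre-index form returns the element at the new position while the post-index form returns the element at the old one. Post-indexing is the natural fit for walking an array in a loop.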

Pair operations

ldp x0, x1, [sp, #16]   // Load pair: Load 16 bytes from [sp+16] into x0,x1

ldp (Load pair): Loads two registers from consecutive memory locations
Efficient for loading 16 bytes at once
Commonly used in function prologues/epilogues

stp x0, x1, [sp, #-16]! // Store pair with pre-decrement

stp (Store pair): Stores two registers to consecutive memory locations
With ! and negative offset: allocates space on stack then stores
Critical for function prologues (saving registers)

Arithmetic and logic

Basic arithmetic

add x0, x1, x2          // x0 = x1 + x2

Adds values in x1 and x2, stores result in x0
Does not affect either source register

add x0, x0, #1          // x0 = x0 + 1 (increment x0)

Immediate variant: adds a constant value (#1)
Used for incrementing counters, pointer arithmetic

sub x0, x1, x2          // x0 = x1 - x2

Subtracts x2 from x1, stores result in x0

sub sp, sp, #16         // SP = SP - 16 (allocate 16 bytes on stack)

When used with SP: allocates space on the stack
The stack grows downward, so subtraction = allocation

add sp, sp, #16         // SP = SP + 16 (deallocate 16 bytes from stack)

When used with SP: deallocates stack space
This is the “restore stack” operation
Adds bytes to the stack pointer, moving it up to its previous position
If SP was 0xFF0, it returns to 0x1000
The memory isn’t cleared, but the pointer moves so the space is available for reuse

Multiply and divide

mul x0, x1, x2          // x0 = x1 * x2

Multiplies x1 by x2, stores result in x0

udiv x0, x1, x2         // x0 = x1 / x2 (unsigned division)

Divides x1 by x2 (unsigned), stores result in x0

msub x0, x1, x2, x3     // x0 = x3 - (x1 * x2)

Multiply-subtract: multiplies x1 by x2, then subtracts from x3
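
ARM64 has no remainder instruction, so a udiv followed by an msub is the pattern to expect wherever source code uses the % operator. The sketch below models the two instructions in C; the function names are made up for the example.

```c
#include <stdint.h>

/* Model of: udiv x0, x1, x2   (ARM64 defines division by zero to yield 0) */
uint64_t udiv64(uint64_t n, uint64_t d) {
    return d == 0 ? 0 : n / d;
}

/* Model of: msub x0, x1, x2, x3   ->   x3 - (x1 * x2), wrapping modulo 2^64 */
uint64_t msub64(uint64_t a, uint64_t b, uint64_t c) {
    return c - a * b;
}

/* Remainder built from the two instructions: n - (n / d) * d */
uint64_t urem64(uint64_t n, uint64_t d) {
    uint64_t q = udiv64(n, d);
    return msub64(q, d, n);
}
```

For 17 and 5 the quotient is 3 and the result is 17 - 15 = 2. Recognizing this pair in a disassembly is a quick way to spot modulo arithmetic such as hash bucketing or ring-buffer indexing.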

Bitwise operations

and x0, x1, x2          // x0 = x1 & x2 (bitwise AND)

Performs logical AND on each bit
Used for masking bits (extracting specific bit fields)

orr x0, x1, x2          // x0 = x1 | x2 (bitwise OR)

Performs logical OR on each bit
Used for setting specific bits

eor x0, x1, x2          // x0 = x1 ^ x2 (bitwise XOR)

Performs exclusive OR on each bit
Useful for toggling bits, checking for changes

mvn x0, x1              // x0 = ~x1 (bitwise NOT)

Inverts all bits in x1

Shifts and rotates

lsl x0, x1, #2          // x0 = x1 << 2 (logical shift left by 2 bits)

Shifts all bits left, filling with zeros
Equivalent to multiplying by 2^n (here 2^2 = 4)

lsr x0, x1, #3          // x0 = x1 >> 3 (logical shift right by 3 bits)

Shifts all bits right, filling with zeros
For unsigned values, equivalent to dividing by 2^n

asr x0, x1, #2          // x0 = x1 >> 2 (arithmetic shift right)

Shifts right but preserves sign bit
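For signed values, equivalent to dividing by 2^n and rounding toward negative infinity (so -7 shifted right by 1 gives -4, while C division of -7 by 2 gives -3)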
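
The difference between an arithmetic shift and signed division only shows up for negative values that are not exact multiples of 2^n. Here is a small C model of asr, written with unsigned operations so that it does not depend on how a particular compiler shifts negative numbers; the name asr64 is hypothetical.

```c
#include <stdint.h>

/* Model of ASR for shift amounts 0..63: vacated high bits copy the sign bit. */
int64_t asr64(int64_t value, unsigned shift) {
    uint64_t bits = (uint64_t)value >> shift;    /* logical shift: fills with zeros */
    if (value < 0 && shift > 0) {
        bits |= ~UINT64_C(0) << (64 - shift);    /* turn the zero fill into ones */
    }
    return (int64_t)bits;
}
```

asr64(-7, 1) yields -4, whereas the C expression -7 / 2 yields -3. This is why compilers emit a small correction sequence around asr when translating signed division by a power of two.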

Control flow

Branches

b label                 // Branch to label (unconditional jump)

Changes execution to the instruction at label
Uses PC-relative addressing (offset from current PC)

bl function             // Branch with link to function

Branches to function address
Stores return address in link register (x30/lr)
This is how functions are called in ARM64

blr x0                  // Branch with link to address in x0

Branches to the address stored in register x0
Stores return address in link register
Used for function pointers, virtual methods

ret                     // Return from subroutine

Returns to address stored in x30/lr
Usually the last instruction in a function

Conditional branches

cmp x0, x1              // Compare x0 with x1 (sets condition flags)

Subtracts x1 from x0 without storing the result
Sets condition flags (zero, negative, carry, overflow)
Used before conditional branches

beq label               // Branch if equal (if Z flag set)

Branches to label if the result of comparison was equal
Checks the Zero flag set by previous comparison

bne label               // Branch if not equal (if Z flag clear)

Branches if the compared values were not equal

bgt label               // Branch if greater than (signed)

Branches if first value was greater than second
For signed comparisons (treats values as signed integers)

blt label               // Branch if less than (signed)

Branches if first value was less than second

b.eq label              // Canonical ARM64 spelling of beq

ARM64 writes conditional branches as b.<cond>: b.eq, b.ne, b.gt, b.lt
The beq/bne/bgt/blt spellings above follow 32-bit ARM; some assemblers accept them as aliases
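
To show how cmp and the conditional branches fit together, the sketch below models the four condition flags in C and evaluates the eq, ne, gt, and lt conditions from them. The Flags type and function names are invented for the example; the flag rules follow the architecture's definition of a subtraction that discards its result.

```c
#include <stdbool.h>
#include <stdint.h>

/* The four condition flags set by cmp. */
typedef struct {
    bool n;  /* Negative: result has its top bit set */
    bool z;  /* Zero: operands were equal */
    bool c;  /* Carry: set when no borrow occurred (a >= b, unsigned) */
    bool v;  /* oVerflow: signed result did not fit */
} Flags;

/* Model of: cmp a, b   (computes a - b and keeps only the flags) */
Flags cmp64(uint64_t a, uint64_t b) {
    uint64_t r = a - b;
    Flags f;
    f.n = (r >> 63) & 1;
    f.z = (r == 0);
    f.c = (a >= b);
    f.v = (((a ^ b) & (a ^ r)) >> 63) & 1;
    return f;
}

bool cond_eq(Flags f) { return f.z; }                 /* b.eq */
bool cond_ne(Flags f) { return !f.z; }                /* b.ne */
bool cond_gt(Flags f) { return !f.z && f.n == f.v; }  /* b.gt (signed) */
bool cond_lt(Flags f) { return f.n != f.v; }          /* b.lt (signed) */
```

Comparing the bit pattern of -1 with 1 sets the carry flag (as unsigned values, the first operand is larger) yet satisfies the signed lt condition. That is the reason signed and unsigned comparisons use different condition codes after the same cmp.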

Advanced control

cbz x0, label           // Compare and Branch if Zero

If x0 equals zero, branch to label
More efficient than separate cmp + beq

cbnz x0, label          // Compare and Branch if Not Zero

If x0 is not zero, branch to label
Used for null pointer checks, loop condition tests

Stack operations

Basic stack usage

// Simple stack sequence
sub sp, sp, #16         // Allocate 16 bytes on stack
str x0, [sp, #8]        // Store x0 at offset 8 from sp
str x1, [sp]            // Store x1 at offset 0 from sp
// ... code using stack values ...
ldr x0, [sp, #8]        // Restore x0 from stack
ldr x1, [sp]            // Restore x1 from stack
add sp, sp, #16         // Deallocate 16 bytes (restore stack)

The above sequence:

Allocates 16 bytes of stack space by decreasing SP
Stores registers x0 and x1 onto the stack
Later retrieves values back into registers
Deallocates stack space by increasing SP

Understanding stack operations in detail

What is the stack pointer (SP)?

The stack pointer (sp) is a special register that points to the current “top” of the stack. In ARM64:

The stack grows downward in memory (from higher to lower addresses)
sp always points to the last item pushed onto the stack
The stack is a region of memory used for temporary storage of data like:
- Local variables
- Return addresses
- Saved registers
- Function parameters that don’t fit in registers

Allocating stack space

When you allocate bytes on the stack using:

sub sp, sp, #16         // Allocate 16 bytes

What happens:

The stack pointer value is decreased by 16 bytes
This creates 16 bytes of new “space” on the stack
No memory is actually modified - just the pointer moves
This space is now available for your function to use

Visual example

Memory     Before        After
Address    Allocation    Allocation
--------   -----------   -----------
0x1000     <- SP
0x0FF8
0x0FF0                   <- SP

Storing at an offset from SP

When you store data using an offset:

str x0, [sp, #8]        // Store x0 at SP+8

What happens:

The address is calculated: SP + 8
The value in register x0 is written to that memory address
The stack pointer itself doesn’t move

Visual example (after allocating 16 bytes)

Memory     Contents
Address    After Operations
--------   ---------------
0x1000     (previous data)
0x0FF8     [x0's value]     <- SP+8
0x0FF0     (unused)         <- SP
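The allocation and store steps can also be modeled in C. This is a minimal sketch, assuming a small simulated memory region and the addresses used in the diagrams. The names sim_stack, stack_alloc, stack_free, stack_store64, and stack_load64 are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_BASE 0x0F00u /* lowest simulated address (assumption) */
#define STACK_TOP  0x1000u /* initial SP, as in the diagrams */

typedef struct {
    uint8_t  mem[STACK_TOP - STACK_BASE]; /* simulated stack memory */
    uint64_t sp;                          /* simulated stack pointer */
} sim_stack;

/* Model of `sub sp, sp, #bytes`: only the pointer moves. */
void stack_alloc(sim_stack *s, uint64_t bytes) {
    assert(bytes <= s->sp - STACK_BASE);
    s->sp -= bytes;
}

/* Model of `add sp, sp, #bytes`. */
void stack_free(sim_stack *s, uint64_t bytes) {
    assert(s->sp + bytes <= STACK_TOP);
    s->sp += bytes;
}

/* Model of `str xN, [sp, #offset]`: writes 8 bytes at SP + offset. */
void stack_store64(sim_stack *s, uint64_t offset, uint64_t value) {
    uint64_t addr = s->sp + offset;
    assert(addr >= STACK_BASE && addr + 8 <= STACK_TOP);
    memcpy(&s->mem[addr - STACK_BASE], &value, sizeof value);
}

/* Model of `ldr xN, [sp, #offset]`: reads 8 bytes at SP + offset. */
uint64_t stack_load64(const sim_stack *s, uint64_t offset) {
    uint64_t addr = s->sp + offset;
    uint64_t value;
    assert(addr >= STACK_BASE && addr + 8 <= STACK_TOP);
    memcpy(&value, &s->mem[addr - STACK_BASE], sizeof value);
    return value;
}
```

Starting from sp = 0x1000, stack_alloc(&s, 16) leaves sp at 0x0FF0, and stack_store64(&s, 8, value) writes to 0x0FF8 without moving sp, which matches the two diagrams.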

Complete stack example

The following example allocates stack space, uses it, and then deallocates it:

my_function:
    // Prologue: allocate 16 bytes on stack
    sub sp, sp, #16         // SP = SP - 16

    // Store two registers on stack
    str x0, [sp, #8]        // Store first parameter at SP+8
    str x1, [sp]            // Store second parameter at SP

    // Do some work with the parameters...
    ldr x0, [sp, #8]        // Load first parameter back into x0
    add x0, x0, #5          // Add 5 to it

    // Store result at SP+8
    str x0, [sp, #8]

    // Load result from stack
    ldr x0, [sp, #8]        // Load result into return register

    // Epilogue: deallocate stack space
    add sp, sp, #16         // SP = SP + 16
    ret                     // Return to caller

In this example:

The function subtracts 16 from SP to allocate space.
It stores parameters at specific offsets (SP+0 and SP+8).
It performs calculations using those values.
It stores the result back to the stack.
It loads the result into x0, the return value register.
It deallocates the stack space by adding 16 to SP.
It returns to the caller.

Offsets from SP organize the stack frame. Each value has a predictable location, so later instructions can load it by offset.

Function prologue and epilogue

// Function prologue (standard pattern)
stp x29, x30, [sp, #-16]!  // Save FP and LR, allocate 16 bytes
mov x29, sp                // Set up frame pointer
sub sp, sp, #32            // Allocate 32 bytes for local variables

// Function body...

// Function epilogue (standard pattern)
add sp, sp, #32            // Deallocate locals
ldp x29, x30, [sp], #16    // Restore FP and LR, deallocate 16 bytes
ret                        // Return to caller

This pattern:

Saves frame pointer and return address
Establishes new frame pointer
Allocates space for local variables
On exit, undoes these steps in reverse order
Returns to caller

Advanced stack alignment

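mov x9, sp                        // AND (immediate) cannot read SP as a source operand
and sp, x9, #0xfffffffffffffff0   // Round SP down to a 16-byte boundary

Bitwise ANDs the copied SP value with -16 (0xFFFFFFFFFFFFFFF0 as a 64-bit value)
Clears the lower 4 bits (rounds down to the nearest multiple of 16)
ARM64 requires SP to be 16-byte aligned whenever it is used as the base address of a memory access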
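The same masking step can be written in C. This is a minimal sketch; align_down_16 is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to a 16-byte boundary by clearing its low four bits. */
uint64_t align_down_16(uint64_t addr) {
    uint64_t aligned = addr & ~(uint64_t)0xF;
    assert(aligned % 16 == 0 && aligned <= addr);
    return aligned;
}
```

Because the stack grows downward, rounding down never moves SP back into space that is already in use.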

SIMD and floating point

Floating point registers

ARM64 has 32 floating-point/SIMD registers (v0-v31):

Can be accessed as:
- b0-b31: 8-bit values
- h0-h31: 16-bit values
- s0-s31: 32-bit values
- d0-d31: 64-bit values
- q0-q31: 128-bit values

Floating point load/store

ldr s0, [x0]            // Load 32-bit float from address in x0 into s0

Loads a single-precision float

ldr d0, [x0]            // Load 64-bit double from address in x0 into d0

Loads a double-precision float

Floating point arithmetic

fadd d0, d1, d2         // d0 = d1 + d2 (double precision)

Adds two double-precision floating point values

fmul s0, s1, s2         // s0 = s1 * s2 (single precision)

Multiplies two single-precision floating point values

SIMD instructions

fmov d0, #1.0           // Move immediate float value 1.0 into d0

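Moves an immediate floating-point value into a scalar register (a scalar move, shown next to the vector form for contrast)
Only a small set of constants, such as 1.0, can be encoded directly as an fmov immediate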

add v0.4s, v1.4s, v2.4s // Add four 32-bit integers in parallel

Adds four 32-bit lanes from v1 and v2, stores in v0
Processes multiple elements in a single instruction
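A scalar C loop shows what the vector add computes. This is a sketch of the semantics only, not of how the hardware executes it; add_4s is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of `add v0.4s, v1.4s, v2.4s`: four independent 32-bit lanes. */
void add_4s(uint32_t dst[4], const uint32_t a[4], const uint32_t b[4]) {
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t lane = 0; lane < 4; lane++) {
        /* Each lane wraps modulo 2^32; no carry crosses into the next lane. */
        dst[lane] = a[lane] + b[lane];
    }
}
```

The point of the vector instruction is that all four lane additions are issued as one instruction instead of four.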

Dynamic linking and loading

Dynamic linker operation

The dynamic linker (dyld) on macOS and iOS is responsible for loading and preparing executable code at runtime.

The role of dyld (dynamic link editor)

The loading process follows these key steps:

Load the Main Executable
- Binary mapped into memory
- Headers parsed to identify dependencies
- Path to shared libraries determined
Load Dependent Libraries
- Recursive loading of all dependent libraries
- Resolution of library search paths
- Library validation (code signing, permissions)
Perform Relocations
- Fix up (rebase) internal pointers based on the actual load address
- Update references to match the randomized (ASLR) memory layout
- Prepare for symbol resolution
Bind Symbols
- Resolve external function and data references
- Fill in address tables (GOT, etc.)
- Implement lazy/non-lazy binding as needed
Initialize Libraries
- Run library initialization code in dependency order
- Execute static constructors
- Prepare runtime environment

Simplified dyld implementation

The following pseudocode shows a simplified dyld loading flow:

void dyld_main(const macho_header* mainExecutable) {
    // Map main executable
    mapMainExecutable(mainExecutable);

    // Load dependent libraries recursively
    loadDependentLibraries(mainExecutable);

    // Perform relocations for Position Independent Code
    applyRelocations(mainExecutable);

    // Bind external symbols
    bindSymbols(mainExecutable);

    // Run initialization routines
    runInitializers(mainExecutable);

    // Jump to executable's entry point
    call_main(mainExecutable->entrypoint);
}

Dynamic linker control variables

The dynamic linker’s behavior can be controlled through environment variables:

# Log calls made to dyld APIs (such as dlopen)
export DYLD_PRINT_APIS=1

# Print loaded images
export DYLD_PRINT_LIBRARIES=1

# Modify library search paths
export DYLD_LIBRARY_PATH=/path/to/custom/libs

# Insert libraries into processes (typically ignored for restricted or hardened binaries)
export DYLD_INSERT_LIBRARIES=/path/to/libhook.dylib

Symbol resolution techniques

Runtime symbol lookup

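When dyld (the dynamic linker) resolves a symbol, it conceptually matches the requested name against the symbols an image exports. The following simplified sketch scans the nlist symbol table linearly; production dyld consults a compact export trie rather than scanning every entry: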

// Simplified implementation of how dyld resolves symbols
Symbol* dyld_lookup_symbol(const char* symbolName, ImageLoader* fromImage) {
    // Get string table and symbol table
    const char* stringTable = fromImage->getStringTable();
    const struct nlist_64* symbolTable = fromImage->getSymbolTable();
    uint32_t symbolCount = fromImage->getSymbolCount();

    // Iterate through symbols
    for (uint32_t i = 0; i < symbolCount; i++) {
        const struct nlist_64* symbol = &symbolTable[i];

        // Get symbol name from string table using n_strx
        const char* name = &stringTable[symbol->n_strx];

        // Check if matches target symbol
        if (strcmp(name, symbolName) == 0) {
            // Check if symbol is exported (using n_type field)
            if ((symbol->n_type & N_EXT) && !(symbol->n_type & N_STAB)) {
                // Calculate actual address using n_value
                return calculateActualAddress(symbol->n_value, symbol->n_sect);
            }
        }
    }

    return NULL; // Symbol not found
}

Lazy vs. non-lazy symbol binding

Mach-O uses different binding strategies for efficiency:

// Non-lazy binding: Symbols resolved at load time
// Used for data references that must be valid immediately
extern NSString *const kImportantConstant;  // Non-lazy binding

// Lazy binding: Symbols resolved on first use
// Used for function calls to minimize startup time
void someRarelyCalledFunction(void);  // Lazy binding

The actual binding mechanism uses stub code:

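// Lazy binding stub for external function call
// (_external_function_lazy_ptr is an illustrative label for the lazy pointer slot)
_external_function_stub:
    adrp x16, _external_function_lazy_ptr@PAGE             // Page of the lazy pointer slot
    ldr  x16, [x16, _external_function_lazy_ptr@PAGEOFF]   // Load the current target from the slot
    br   x16                              // Jump to it (initially a helper that reaches dyld_stub_binder)

On the first call, the slot still points at the binding helper, and dyld:

Resolves the symbol using the symbol tables
Updates the lazy pointer so that the stub reaches the target function directly

Later calls go through the same stub straight to the target, without dyld intervention.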
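The patch-on-first-call idea can be sketched in portable C with a function pointer standing in for the slot the stub jumps through. This is a simplified model under stated assumptions, not dyld's implementation; real_target, binder, lazy_pointer, resolve_count, and call_through_stub are all hypothetical names:

```c
#include <assert.h>
#include <stddef.h>

typedef int (*target_fn)(int);

/* The function that the import should eventually reach. */
int real_target(int x) { return x + 1; }

int binder(int x);               /* forward declaration of the slow path */
int resolve_count = 0;           /* counts how often the slow path runs */
target_fn lazy_pointer = binder; /* the slot initially points at the binder */

/* Slow path: "resolve" the symbol, patch the slot, then forward the call. */
int binder(int x) {
    resolve_count++;
    lazy_pointer = real_target;
    return lazy_pointer(x);
}

/* The stub itself never changes: it always jumps through the slot. */
int call_through_stub(int x) {
    assert(lazy_pointer != NULL);
    return lazy_pointer(x);
}
```

Calling call_through_stub twice runs binder only once; the second call reaches real_target directly through the patched slot.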

Bind opcodes

The Mach-O format uses a specialized format called “bind opcodes” to encode symbol binding information. This compact representation tells the dynamic linker how to resolve symbols:

// Binding opcodes found in LC_DYLD_INFO_ONLY load command
// Example interpretation:
BIND_OPCODE_SET_DYLIB_ORDINAL(1)    // symbol from dylib index 1
BIND_OPCODE_SET_SYMBOL_TRAILING_FLAGS_IMM(0, "_printf")  // symbol name
BIND_OPCODE_SET_TYPE_IMM(BIND_TYPE_POINTER)  // binding type
BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB(1, 0x1000)  // location
BIND_OPCODE_DO_BIND()  // perform binding
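In the encoded stream, each opcode byte carries the operation in its high four bits and a small immediate in its low four bits, and larger operands follow as ULEB128 values. The following is a minimal decoding sketch. The mask values match those in <mach-o/loader.h> but are repeated here so the code builds on any platform; read_uleb128, bind_opcode, and bind_immediate are hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPCODE_MASK    0xF0u /* same value as BIND_OPCODE_MASK */
#define IMMEDIATE_MASK 0x0Fu /* same value as BIND_IMMEDIATE_MASK */

/* Decode one ULEB128 value starting at *pos and advance *pos past it. */
uint64_t read_uleb128(const uint8_t *buf, size_t len, size_t *pos) {
    uint64_t result = 0;
    unsigned shift = 0;
    for (;;) {
        assert(*pos < len && shift < 64);
        uint8_t byte = buf[(*pos)++];
        result |= (uint64_t)(byte & 0x7F) << shift; /* low 7 bits are payload */
        if ((byte & 0x80) == 0)                     /* high bit clear: last byte */
            break;
        shift += 7;
    }
    return result;
}

/* Split an opcode byte into its operation and its 4-bit immediate. */
uint8_t bind_opcode(uint8_t byte)    { return (uint8_t)(byte & OPCODE_MASK); }
uint8_t bind_immediate(uint8_t byte) { return (uint8_t)(byte & IMMEDIATE_MASK); }
```

For example, the byte 0x11 splits into opcode 0x10 (set dylib ordinal) with immediate 1, and the bytes 0x80 0x20 decode as the ULEB128 value 0x1000.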

Resolution process

Symbol resolution goes through these stages:

Symbol Lookup
- Search in flat namespace or two-level namespace
- Find defining library for symbol
Address Resolution
- Determine actual address of symbol
- Account for library base address
GOT/Stub Patching
- Update function address tables
- Update lazy pointers so that stubs reach the target directly after first resolution

Two-level namespace

macOS uses a two-level namespace to avoid symbol conflicts:

struct two_level_hint {
    uint32_t library_ordinal : 8,
             symbol_index    : 24;
};

Each symbol reference includes both the symbol name and the library identifier, preventing collisions across libraries with identical symbol names.

The implementation considers both symbol name and library ordinal:

// Two-level namespace resolution
Symbol* lookupTwoLevelNamespace(const char* name, int libraryOrdinal) {
    // Get the specific library for this ordinal
    ImageLoader* library = getLibraryForOrdinal(libraryOrdinal);
    if (!library)
        return NULL;

    // Look only in that specific library
    return library->findExportedSymbol(name);
}
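In Mach-O symbol tables, the library ordinal of an undefined symbol is stored in the high 8 bits of the nlist n_desc field. The following sketch mirrors the bit layout of the GET_LIBRARY_ORDINAL and SET_LIBRARY_ORDINAL macros from <mach-o/nlist.h>. The function names are hypothetical, and the header is not included so the code builds on any platform:

```c
#include <assert.h>
#include <stdint.h>

/* Extract the library ordinal from the high byte of n_desc. */
uint8_t library_ordinal(uint16_t n_desc) {
    return (uint8_t)((n_desc >> 8) & 0xFF);
}

/* Return n_desc with its high byte replaced by the given ordinal. */
uint16_t with_library_ordinal(uint16_t n_desc, uint8_t ordinal) {
    uint16_t updated = (uint16_t)((n_desc & 0x00FF) | ((uint16_t)ordinal << 8));
    assert(library_ordinal(updated) == ordinal);
    return updated;
}
```

An ordinal extracted this way is the kind of value a function like lookupTwoLevelNamespace would receive as its libraryOrdinal argument.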

Runtime symbol introspection

The symbol table enables runtime API lookups:

// Using dlsym to find a symbol at runtime
void* function_pointer = dlsym(RTLD_DEFAULT, "functionName");
if (function_pointer) {
    // Cast and call the function
    void (*function)(void) = (void (*)(void))function_pointer;
    function();
}

At the assembly level, a call to dlsym looks like this:

// Call dlsym to find function
mov x0, #-2                     // RTLD_DEFAULT is the constant (void *)-2 on macOS
adrp x1, L_function_name@PAGE   // Function name string
add x1, x1, L_function_name@PAGEOFF
bl _dlsym                       // Call dlsym
cbz x0, L_symbol_not_found      // Check if NULL
blr x0                          // Call the resolved function

Library loading sequence

Library search paths

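When resolving a library reference, dyld interprets the install name recorded in the binary and consults several locations. The three @ prefixes are expanded in place rather than searched one after another:

@rpath - Replaced in turn by each run-path entry (LC_RPATH) recorded by the binary and its loaders
@executable_path - Replaced by the directory of the main executable
@loader_path - Replaced by the directory of the image that contains the load command
DYLD_LIBRARY_PATH - Directories from this environment variable, tried ahead of the recorded path when the variable is set and honored
System default paths - /usr/lib, etc.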
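The @executable_path and @loader_path forms are textual prefixes that are replaced with a directory. The following is a minimal sketch of that substitution, assuming the caller already knows both directories. expand_install_name is a hypothetical helper, not a dyld API, and @rpath handling is deliberately left out:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Expand a leading @executable_path or @loader_path in install_name.
 * exe_dir and loader_dir are directory paths supplied by the caller.
 * Returns 0 on success, or -1 if the result does not fit in out.
 */
int expand_install_name(const char *install_name,
                        const char *exe_dir, const char *loader_dir,
                        char *out, size_t out_size) {
    static const char exe_prefix[] = "@executable_path";
    static const char loader_prefix[] = "@loader_path";
    const char *dir = "";          /* no prefix: keep the path as recorded */
    const char *rest = install_name;

    assert(install_name && exe_dir && loader_dir && out);
    if (strncmp(install_name, exe_prefix, sizeof exe_prefix - 1) == 0) {
        dir = exe_dir;
        rest = install_name + sizeof exe_prefix - 1;
    } else if (strncmp(install_name, loader_prefix, sizeof loader_prefix - 1) == 0) {
        dir = loader_dir;
        rest = install_name + sizeof loader_prefix - 1;
    }

    int n = snprintf(out, out_size, "%s%s", dir, rest);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With exe_dir set to /a and loader_dir set to /b, the name @loader_path/libbar.dylib expands to /b/libbar.dylib, while an absolute path is copied through unchanged.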

Example of @rpath usage in Objective-C

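// Loading a framework at runtime using @rpath
// InitFunction is an assumed signature for the framework's entry point
typedef BOOL (*InitFunction)(void);
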
- (BOOL)loadFramework {
    NSString *frameworkPath = @"@rpath/MyFramework.framework/MyFramework";
    void *handle = dlopen([frameworkPath UTF8String], RTLD_LAZY);
    if (!handle) {
        NSLog(@"Failed to load: %s", dlerror());
        return NO;
    }

    // Find and call initialization function
    InitFunction initFunc = (InitFunction)dlsym(handle, "InitializeFramework");
    if (initFunc) {
        return initFunc();
    }
    return NO;
}

Assembly for dynamic loading:

// Prepare UTF8String call
mov x0, x20            // NSString frameworkPath
adrp x1, L_sel_UTF8String@PAGE
ldr x1, [x1, L_sel_UTF8String@PAGEOFF]
bl _objc_msgSend
// x0 already holds the UTF8String result, which is dlopen's path argument

// Call dlopen
mov w1, #1             // RTLD_LAZY (mode argument)
bl _dlopen
mov x19, x0            // Save handle

// Check for NULL
cbz x19, L_error_handler

// Call dlsym to find function
mov x0, x19            // Library handle
adrp x1, L_func_name@PAGE
add x1, x1, L_func_name@PAGEOFF   // Address of the C string
bl _dlsym
mov x20, x0            // Save function pointer

// Check function pointer
cbz x20, L_no_function

// Call function
blr x20

Third-party library integration patterns

Dynamically loaded plugins

// Plugin manager implementation
@implementation PluginManager

- (NSArray<id<PluginProtocol>> *)loadPlugins {
    NSMutableArray *plugins = [NSMutableArray array];
    NSString *pluginsDir = [[NSBundle mainBundle] pathForResource:@"Plugins" ofType:nil];
    NSArray *pluginFiles = [[NSFileManager defaultManager] contentsOfDirectoryAtPath:pluginsDir error:nil];

    for (NSString *pluginName in pluginFiles) {
        if ([pluginName hasSuffix:@".bundle"]) {
            NSString *pluginPath = [pluginsDir stringByAppendingPathComponent:pluginName];
            NSBundle *pluginBundle = [NSBundle bundleWithPath:pluginPath];

            if ([pluginBundle load]) {
                // Get principal class that conforms to PluginProtocol
                Class principalClass = [pluginBundle principalClass];
                if ([principalClass conformsToProtocol:@protocol(PluginProtocol)]) {
                    id<PluginProtocol> plugin = [[principalClass alloc] init];
                    [plugins addObject:plugin];
                }
            }
        }
    }

    return plugins;
}
@end

Assembly representation of bundle loading:

// Load NSBundle class
adrp x0, _OBJC_CLASS_$_NSBundle@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSBundle@PAGEOFF]

// Call bundleWithPath:
mov x2, x21            // pluginPath string
adrp x1, L_sel_bundleWithPath@PAGE
ldr x1, [x1, L_sel_bundleWithPath@PAGEOFF]
bl _objc_msgSend
mov x19, x0            // Store NSBundle

// Call load method
mov x0, x19            // NSBundle instance
adrp x1, L_sel_load@PAGE
ldr x1, [x1, L_sel_load@PAGEOFF]
bl _objc_msgSend
cbz w0, L_load_failed  // Test boolean result

// Get principal class
mov x0, x19            // NSBundle instance
adrp x1, L_sel_principalClass@PAGE
ldr x1, [x1, L_sel_principalClass@PAGEOFF]
bl _objc_msgSend
mov x20, x0            // Store Class

Static library integration

// Using a statically linked library
#import "ThirdPartyLib.h"

- (void)useStaticLibrary {
    // Initialize the library
    TPLManager *manager = [TPLManager sharedManager];

    // Configure with API key
    [manager setAPIKey:@"your-api-key"];

    // Use library functionality
    TPLResult *result = [manager processData:self.inputData];

    // Handle result
    if (result.success) {
        self.outputLabel.text = result.outputString;
    }
}

Assembly pattern for static library calls:

// Get singleton instance
adrp x0, _OBJC_CLASS_$_TPLManager@PAGE
ldr x0, [x0, _OBJC_CLASS_$_TPLManager@PAGEOFF]
adrp x1, L_sel_sharedManager@PAGE
ldr x1, [x1, L_sel_sharedManager@PAGEOFF]
bl _objc_msgSend
mov x19, x0            // Store manager instance

// Set API key
mov x0, x19            // Manager instance
adrp x2, L_api_key@PAGE
ldr x2, [x2, L_api_key@PAGEOFF]  // API key string
adrp x1, L_sel_setAPIKey@PAGE
ldr x1, [x1, L_sel_setAPIKey@PAGEOFF]
bl _objc_msgSend

// Process data
mov x0, x19            // Manager instance
ldr x20, [x21, #8]     // Load self.inputData from ivar offset
mov x2, x20            // Input data
adrp x1, L_sel_processData@PAGE
ldr x1, [x1, L_sel_processData@PAGEOFF]
bl _objc_msgSend
mov x22, x0            // Store result

CocoaPods/Swift Package integration

// Using libraries added via CocoaPods or Swift Package Manager
#import "MyApp-Swift.h"  // Generated Swift header (name is an assumption) exposing AlamofireWrapper; Alamofire itself has no Objective-C API
#import <AFNetworking/AFNetworking.h>  // CocoaPod

- (void)makeNetworkRequest {
    // Using AFNetworking (Objective-C library)
    AFHTTPSessionManager *manager = [AFHTTPSessionManager manager];
    [manager GET:@"https://api.example.com/data"
      parameters:nil
         headers:nil
        progress:nil
         success:^(NSURLSessionDataTask *task, id responseObject) {
             NSLog(@"JSON: %@", responseObject);
         }
         failure:^(NSURLSessionDataTask *task, NSError *error) {
             NSLog(@"Error: %@", error);
         }];

    // Using Alamofire through AlamofireWrapper, a hypothetical @objc Swift wrapper class
    [AlamofireWrapper requestURL:@"https://api.example.com/profile"
                      completion:^(NSDictionary *result, NSError *error) {
        if (error) {
            NSLog(@"Error: %@", error);
        } else {
            NSLog(@"Result: %@", result);
        }
    }];
}

Assembly for external library usage:

// Get AFHTTPSessionManager
adrp x0, _OBJC_CLASS_$_AFHTTPSessionManager@PAGE
ldr x0, [x0, _OBJC_CLASS_$_AFHTTPSessionManager@PAGEOFF]
adrp x1, L_sel_manager@PAGE
ldr x1, [x1, L_sel_manager@PAGEOFF]
bl _objc_msgSend
mov x19, x0            // Store manager instance

// Set up for GET call (loads URL string)
mov x0, x19            // Manager instance
adrp x2, L_url_string@PAGE
ldr x2, [x2, L_url_string@PAGEOFF]

// Set up parameters (nil)
mov x3, xzr

// Set up headers (nil)
mov x4, xzr

// Set up progress block (nil)
mov x5, xzr

// Set up success block (complex block literal setup)
// ... block setup code for success handler ...

// Set up failure block (complex block literal setup)
// ... block setup code for failure handler ...

// Call GET method
adrp x1, L_sel_GET_parameters@PAGE
ldr x1, [x1, L_sel_GET_parameters@PAGEOFF]
bl _objc_msgSend

WebSocket communication

// WebSocket implementation using SocketRocket library
- (void)setupWebSocket {
    // Create WebSocket connection
    SRWebSocket *webSocket = [[SRWebSocket alloc] initWithURL:[NSURL URLWithString:@"wss://websocket.example.com/socket"]];
    webSocket.delegate = self;

    // Request headers (unused as written: sending them requires building an NSURLRequest and using initWithURLRequest:)
    NSDictionary *headers = @{@"Authorization": [NSString stringWithFormat:@"Bearer %@", self.accessToken]};
    [webSocket setDelegateOperationQueue:[NSOperationQueue mainQueue]];
    [webSocket setRequestCookies:[[NSHTTPCookieStorage sharedHTTPCookieStorage] cookiesForURL:webSocket.url]];

    // Connect
    [webSocket open];
    self.webSocket = webSocket;
}

// WebSocket delegate methods
- (void)webSocket:(SRWebSocket *)webSocket didReceiveMessage:(id)message {
    if ([message isKindOfClass:[NSString class]]) {
        // Parse JSON message
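        NSError *jsonError = nil;
        NSDictionary *jsonData = [NSJSONSerialization JSONObjectWithData:[message dataUsingEncoding:NSUTF8StringEncoding]
                                                                 options:0
                                                                   error:&jsonError];
        // Check the returned object rather than the error pointer (Cocoa convention)
        if ([jsonData isKindOfClass:[NSDictionary class]]) {
            [self handleWebSocketEvent:jsonData];
        }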
    }
}

- (void)webSocket:(SRWebSocket *)webSocket didFailWithError:(NSError *)error {
    NSLog(@"WebSocket failed with error: %@", error);
    // Attempt reconnection after delay
    dispatch_after(dispatch_time(DISPATCH_TIME_NOW, 5 * NSEC_PER_SEC), dispatch_get_main_queue(), ^{
        [self setupWebSocket];
    });
}

- (void)webSocket:(SRWebSocket *)webSocket didCloseWithCode:(NSInteger)code reason:(NSString *)reason wasClean:(BOOL)wasClean {
    NSLog(@"WebSocket closed: %@", reason);
    self.webSocket = nil;
}

// Send message through WebSocket
- (void)sendEvent:(NSString *)eventType withData:(NSDictionary *)data {
    if (self.webSocket.readyState != SR_OPEN) {
        [self setupWebSocket];
        return;
    }

    NSMutableDictionary *message = [NSMutableDictionary dictionaryWithDictionary:data];
    message[@"type"] = eventType;
    message[@"timestamp"] = @(floor([[NSDate date] timeIntervalSince1970] * 1000));

    NSError *error = nil;
    NSData *jsonData = [NSJSONSerialization dataWithJSONObject:message options:0 error:&error];
    if (jsonData) {
        NSString *jsonString = [[NSString alloc] initWithData:jsonData encoding:NSUTF8StringEncoding];
        [self.webSocket send:jsonString];
    }
}

Assembly pattern for WebSocket operations:

// Create WebSocket
adrp x0, _OBJC_CLASS_$_SRWebSocket@PAGE
ldr x0, [x0, _OBJC_CLASS_$_SRWebSocket@PAGEOFF]
bl _objc_msgSend      // alloc (selector load omitted for brevity)
mov x21, x0            // Save allocated instance

// Create URL (x20 keeps holding self)
adrp x0, _OBJC_CLASS_$_NSURL@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURL@PAGEOFF]
adrp x2, L_websocket_url@PAGE
ldr x2, [x2, L_websocket_url@PAGEOFF]
adrp x1, L_sel_URLWithString@PAGE
ldr x1, [x1, L_sel_URLWithString@PAGEOFF]
bl _objc_msgSend
mov x2, x0             // URL

// Initialize WebSocket
mov x0, x21            // WebSocket (from alloc)
adrp x1, L_sel_initWithURL@PAGE
ldr x1, [x1, L_sel_initWithURL@PAGEOFF]
bl _objc_msgSend
mov x19, x0            // Store WebSocket

// Set delegate
mov x0, x19            // WebSocket
mov x2, x20            // self pointer
adrp x1, L_sel_setDelegate@PAGE
ldr x1, [x1, L_sel_setDelegate@PAGEOFF]
bl _objc_msgSend

// Open connection
mov x0, x19            // WebSocket
adrp x1, L_sel_open@PAGE
ldr x1, [x1, L_sel_open@PAGEOFF]
bl _objc_msgSend

// Store in ivar
str x19, [x20, #ivar_offset_webSocket]

GraphQL client implementation

// GraphQL client in the style of Apollo iOS (Apollo iOS is a Swift API, so the Objective-C classes below are a hypothetical wrapper)
- (void)performGraphQLQuery {
    // Create GraphQL query
    UserProfileQuery *query = [[UserProfileQuery alloc] initWithUserId:self.userId];

    // Execute query
    [[ApolloClient shared] fetch:query
                  cachePolicy:NSURLRequestReloadIgnoringLocalCacheData
                  queue:dispatch_get_main_queue()
                  resultHandler:^(GraphQLQueryResult *result) {
        if (result.error) {
            NSLog(@"Error: %@", result.error);
            return;
        }

        // Process data
        UserProfile *profile = result.data.user;
        self.nameLabel.text = profile.name;
        self.emailLabel.text = profile.email;

        // Load avatar image
        if (profile.avatarUrl) {
            [self.imageLoader loadImageWithURL:profile.avatarUrl
                                   completion:^(UIImage *image) {
                self.avatarImageView.image = image;
            }];
        }
    }];
}

// Mutation example
- (void)updateUserProfile {
    UpdateUserProfileMutation *mutation = [[UpdateUserProfileMutation alloc]
                                          initWithUserId:self.userId
                                          name:self.nameField.text
                                          email:self.emailField.text];

    [[ApolloClient shared] perform:mutation
                       queue:dispatch_get_main_queue()
                       resultHandler:^(GraphQLMutationResult *result) {
        if (result.error) {
            [self showErrorAlert:result.error.localizedDescription];
        } else {
            [self showSuccessMessage:@"Profile updated successfully"];
        }
    }];
}

Advanced analysis techniques

Disassembly

Disassembly is the process of converting machine code back into assembly language. This involves:

Reconstructing assembly instructions from binary
Creating human-readable output
Identifying instruction boundaries
Mapping binary patterns to assembly mnemonics
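As a small illustration of mapping bit patterns to mnemonics, the sketch below decodes only the A64 b and bl instructions. In that encoding, bits 30 to 26 are 00101, bit 31 selects bl, and bits 25 to 0 hold a signed offset counted in 4-byte words. The names branch_insn and decode_branch are hypothetical, and a real disassembler covers far more encodings:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *mnemonic; /* "b" or "bl" */
    uint64_t    target;   /* absolute branch target */
} branch_insn;

/* Decode insn located at address pc. Returns 1 for b/bl, 0 for anything else. */
int decode_branch(uint32_t insn, uint64_t pc, branch_insn *out) {
    assert(out != NULL);
    if ((insn & 0x7C000000u) != 0x14000000u)
        return 0;                        /* bits 30:26 are not 00101 */

    int64_t imm26 = (int64_t)(insn & 0x03FFFFFFu);
    if (imm26 & 0x02000000)
        imm26 -= 0x04000000;             /* sign-extend the 26-bit field */

    out->mnemonic = (insn >> 31) ? "bl" : "b";
    out->target = pc + (uint64_t)(imm26 * 4); /* offset is in words */
    return 1;
}
```

Under this model the word 0x14000002 at address 0x1000 decodes as b 0x1008, while 0xD65F03C0 (ret) is rejected because it belongs to a different encoding group.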

Decompilation

Decompilation goes further than disassembly by:

Converting assembly into higher-level languages (C/C++)
Attempting to reconstruct original program logic
Creating more readable and maintainable code
Using intermediate representations (IR)

Binary analysis tools

Binary Ninja
- Modern interface
- Powerful analysis capabilities
- Support for multiple architectures
Ghidra
- Open-source
- Developed by the NSA
- Extensive plugin ecosystem
Hopper
- User-friendly interface
- Good for macOS and iOS analysis
- Quick analysis capabilities

Applications of binary analysis

Binary analysis tools are useful for:

Reverse engineering
Security research
Vulnerability analysis
Understanding closed-source software
Debugging and troubleshooting

Objective-C internals

Object creation

Creating a new object

// Objective-C code
NSString *str = [[NSString alloc] initWithFormat:@"Hello, %@", name];

// Resulting assembly pattern
// Load NSString class reference
adrp x0, _OBJC_CLASS_$_NSString@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSString@PAGEOFF]

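// Call +[NSString alloc] (current compilers often emit bl _objc_alloc instead)
adrp x1, l_selector_alloc@PAGE
ldr x1, [x1, l_selector_alloc@PAGEOFF]
bl _objc_msgSend

// Format string is the first explicit argument, so it goes in x2
adrp x2, l_fmt@PAGE
ldr x2, [x2, l_fmt@PAGEOFF]
str x20, [sp]           // 'name': variadic arguments are passed on the stack on Apple ARM64

// Call -[NSString initWithFormat:]; x0 still holds the alloc result
adrp x3, l_selector_initWithFormat@PAGE
ldr x1, [x3, l_selector_initWithFormat@PAGEOFF]
bl _objc_msgSend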

Method dispatch

Instance method call

// Objective-C code
[myObject performAction:value];

// Resulting assembly pattern
mov x0, x19            // Load 'myObject' pointer
adrp x2, l_value@PAGE  // Load 'value' parameter
ldr x2, [x2, l_value@PAGEOFF]

// Load selector
adrp x3, l_selector_performAction@PAGE
ldr x1, [x3, l_selector_performAction@PAGEOFF]

// Dynamic dispatch
bl _objc_msgSend

Property access

// Objective-C code
NSInteger count = self.itemCount;

// Resulting assembly pattern
mov x0, x19            // 'self' pointer
adrp x3, l_selector_itemCount@PAGE
ldr x1, [x3, l_selector_itemCount@PAGEOFF]
bl _objc_msgSend       // Calls getter method

Memory management patterns

ARC (Automatic reference counting)

// Objective-C code
{
    NSObject *temp = [[NSObject alloc] init];
    [self doSomethingWith:temp];
}  // temp is released automatically

// Resulting assembly pattern
// Object creation
adrp x0, _OBJC_CLASS_$_NSObject@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSObject@PAGEOFF]
bl _objc_msgSend      // alloc (selector load omitted for brevity)

// Init
adrp x1, l_selector_init@PAGE
ldr x1, [x1, l_selector_init@PAGEOFF]
bl _objc_msgSend
mov x20, x0           // Save the initialized object

// Use object
mov x0, x19           // self
mov x2, x20           // temp object
adrp x1, l_selector_doSomethingWith@PAGE
ldr x1, [x1, l_selector_doSomethingWith@PAGEOFF]
bl _objc_msgSend

// Implicit release at scope end
mov x0, x20
bl _objc_release

Common patterns

File I/O operations

// Objective-C code
NSData *data = [@"Hello" dataUsingEncoding:NSUTF8StringEncoding];
[data writeToFile:@"/path/file.txt" atomically:YES];

// Resulting assembly pattern
// Creating the NSData object
adrp x0, l_string_Hello@PAGE
ldr x0, [x0, l_string_Hello@PAGEOFF]
mov w2, #4            // NSUTF8StringEncoding value
adrp x1, l_selector_dataUsingEncoding@PAGE
ldr x1, [x1, l_selector_dataUsingEncoding@PAGEOFF]
bl _objc_msgSend
mov x19, x0           // Store NSData result

// Prepare arguments for writeToFile method
mov x0, x19           // NSData object
adrp x2, l_path_string@PAGE
ldr x2, [x2, l_path_string@PAGEOFF]
mov w3, #1            // YES for atomically parameter
adrp x1, l_selector_writeToFile_atomically@PAGE
ldr x1, [x1, l_selector_writeToFile_atomically@PAGEOFF]
bl _objc_msgSend

User interface operations

// Objective-C code
UIView *view = [[UIView alloc] initWithFrame:CGRectMake(0, 0, 100, 100)];
view.backgroundColor = [UIColor redColor];

// Resulting assembly pattern
// Load UIView class
adrp x0, _OBJC_CLASS_$_UIView@PAGE
ldr x0, [x0, _OBJC_CLASS_$_UIView@PAGEOFF]
bl _objc_msgSend     // alloc

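// Prepare frame values (CGRectMake); the four doubles are passed in d0-d3
movi d0, #0                   // x = 0.0 (0.0 is not encodable as an fmov immediate)
movi d1, #0                   // y = 0.0
mov x8, #0x4059000000000000   // Bit pattern of 100.0, also not an fmov immediate
fmov d2, x8                   // width = 100
fmov d3, d2                   // height = 100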

// Call initWithFrame:
mov x20, x0          // Store UIView instance
adrp x1, l_selector_initWithFrame@PAGE
ldr x1, [x1, l_selector_initWithFrame@PAGEOFF]
bl _objc_msgSend
mov x19, x0          // Store result

// Get UIColor redColor
adrp x0, _OBJC_CLASS_$_UIColor@PAGE
ldr x0, [x0, _OBJC_CLASS_$_UIColor@PAGEOFF]
adrp x1, l_selector_redColor@PAGE
ldr x1, [x1, l_selector_redColor@PAGEOFF]
bl _objc_msgSend
mov x20, x0         // Store UIColor

// Set backgroundColor property
mov x0, x19         // UIView instance
adrp x1, l_selector_setBackgroundColor@PAGE
ldr x1, [x1, l_selector_setBackgroundColor@PAGEOFF]
mov x2, x20         // UIColor instance
bl _objc_msgSend

Networking operations

// Objective-C code
NSURL *url = [NSURL URLWithString:@"https://example.com"];
NSURLRequest *request = [NSURLRequest requestWithURL:url];
NSURLSession *session = [NSURLSession sharedSession];
NSURLSessionDataTask *task = [session dataTaskWithRequest:request
                                       completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
    // Handle response
}];
[task resume];

// Resulting assembly pattern
// Create NSURL
adrp x0, _OBJC_CLASS_$_NSURL@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURL@PAGEOFF]
adrp x2, l_url_string@PAGE
ldr x2, [x2, l_url_string@PAGEOFF]
adrp x1, l_selector_URLWithString@PAGE
ldr x1, [x1, l_selector_URLWithString@PAGEOFF]
bl _objc_msgSend
mov x20, x0            // Save NSURL for the request

// Create NSURLRequest
adrp x0, _OBJC_CLASS_$_NSURLRequest@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURLRequest@PAGEOFF]
mov x2, x20         // NSURL
adrp x1, l_selector_requestWithURL@PAGE
ldr x1, [x1, l_selector_requestWithURL@PAGEOFF]
bl _objc_msgSend
mov x20, x0         // Store NSURLRequest

// Get shared session
adrp x0, _OBJC_CLASS_$_NSURLSession@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURLSession@PAGEOFF]
adrp x1, l_selector_sharedSession@PAGE
ldr x1, [x1, l_selector_sharedSession@PAGEOFF]
bl _objc_msgSend
mov x21, x0         // Store NSURLSession

// Create data task with request and completion block
mov x0, x21         // NSURLSession
mov x2, x20         // NSURLRequest
// ... complex block setup ...
adrp x1, L_sel_dataTaskWithRequest@PAGE
ldr x1, [x1, L_sel_dataTaskWithRequest@PAGEOFF]
bl _objc_msgSend
mov x22, x0         // Store task

// Resume task
mov x0, x22
adrp x1, L_sel_resume@PAGE
ldr x1, [x1, L_sel_resume@PAGEOFF]
bl _objc_msgSend

Authentication and OAuth flows

// OAuth 2.0 authentication flow
- (void)authenticateWithOAuth {
    // Configure OAuth parameters
    NSDictionary *params = @{
        @"client_id": @"your-client-id",
        @"client_secret": @"your-client-secret",
        @"grant_type": @"authorization_code",
        @"code": self.authorizationCode,
        @"redirect_uri": @"your-app://oauth-callback"
    };

    // Create request
    NSMutableURLRequest *request = [NSMutableURLRequest requestWithURL:[NSURL URLWithString:@"https://oauth.example.com/token"]];
    [request setHTTPMethod:@"POST"];
    [request setValue:@"application/x-www-form-urlencoded" forHTTPHeaderField:@"Content-Type"];

    // Convert parameters to form body
    NSMutableArray *formItems = [NSMutableArray array];
    for (NSString *key in params) {
        NSString *encodedKey = [key stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
        NSString *encodedValue = [params[key] stringByAddingPercentEncodingWithAllowedCharacters:[NSCharacterSet URLQueryAllowedCharacterSet]];
        [formItems addObject:[NSString stringWithFormat:@"%@=%@", encodedKey, encodedValue]];
    }
    NSString *formBody = [formItems componentsJoinedByString:@"&"];
    [request setHTTPBody:[formBody dataUsingEncoding:NSUTF8StringEncoding]];

    // Execute request
    NSURLSession *session = [NSURLSession sharedSession];
    NSURLSessionDataTask *task = [session dataTaskWithRequest:request completionHandler:^(NSData *data, NSURLResponse *response, NSError *error) {
        if (error) {
            NSLog(@"Authentication error: %@", error);
            return;
        }

        // Parse token response
        NSError *jsonError = nil;
        NSDictionary *tokenResponse = [NSJSONSerialization JSONObjectWithData:data options:0 error:&jsonError];
        if (!tokenResponse) {
            NSLog(@"JSON parsing error: %@", jsonError);
            return;
        }

        // Store access token
        NSString *accessToken = tokenResponse[@"access_token"];
        NSString *refreshToken = tokenResponse[@"refresh_token"];
        self.tokenExpiration = [NSDate dateWithTimeIntervalSinceNow:[tokenResponse[@"expires_in"] doubleValue]];

        // Save tokens securely
        [KeychainManager saveAccessToken:accessToken refreshToken:refreshToken];
    }];

    [task resume];
}

Assembly representation of complex network flow:

// Create the URL: [NSURL URLWithString:...]
adrp x0, _OBJC_CLASS_$_NSURL@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURL@PAGEOFF]
adrp x2, L_url_string@PAGE
ldr x2, [x2, L_url_string@PAGEOFF]
adrp x1, L_sel_URLWithString@PAGE
ldr x1, [x1, L_sel_URLWithString@PAGEOFF]
bl _objc_msgSend
mov x2, x0             // URL for request

// Create request: [NSMutableURLRequest requestWithURL:url]
adrp x0, _OBJC_CLASS_$_NSMutableURLRequest@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSMutableURLRequest@PAGEOFF]
adrp x1, L_sel_requestWithURL@PAGE
ldr x1, [x1, L_sel_requestWithURL@PAGEOFF]
bl _objc_msgSend
mov x19, x0            // Request object, kept in a callee-saved register

// Set HTTP method: [request setHTTPMethod:@"POST"]
mov x0, x19
adrp x2, L_POST_method@PAGE
ldr x2, [x2, L_POST_method@PAGEOFF]
adrp x1, L_sel_setHTTPMethod@PAGE
ldr x1, [x1, L_sel_setHTTPMethod@PAGEOFF]
bl _objc_msgSend

// More request setup...
// ... (headers, parameters, etc.)

// Get shared session
adrp x0, _OBJC_CLASS_$_NSURLSession@PAGE
ldr x0, [x0, _OBJC_CLASS_$_NSURLSession@PAGEOFF]
adrp x1, L_sel_sharedSession@PAGE
ldr x1, [x1, L_sel_sharedSession@PAGEOFF]
bl _objc_msgSend
mov x20, x0            // Session object

// Create data task with request and completion block
mov x0, x20            // Session
mov x2, x19            // NSURLRequest
// ... complex block setup (the completion block pointer is passed in x3) ...
adrp x1, L_sel_dataTaskWithRequest@PAGE
ldr x1, [x1, L_sel_dataTaskWithRequest@PAGEOFF]
bl _objc_msgSend
mov x21, x0            // Task

// Resume task
mov x0, x21
adrp x1, L_sel_resume@PAGE
ldr x1, [x1, L_sel_resume@PAGEOFF]
bl _objc_msgSend

Anti-analysis techniques

The following examples show techniques that can complicate reverse engineering and binary analysis.

Dynamic library subversion

Delayed loading and runtime injection

Obfuscated software can avoid static detection by loading libraries only when needed:

#include <dlfcn.h>
#include <string.h>

// Direct library load (easy to identify statically)
void* handle = dlopen("libexample.dylib", RTLD_LAZY);

// Less obvious statically: constructing the library name at runtime
char lib_name[20] = {0};
strcpy(lib_name, "lib");
strcat(lib_name, "example");
strcat(lib_name, ".dylib");
void* runtime_handle = dlopen(lib_name, RTLD_LAZY);

When disassembled, the second approach never shows the complete library name as a single string:

// Disassembly shows only strcpy/strcat calls with partial strings
adrp x0, l_buffer@PAGE
add x0, x0, l_buffer@PAGEOFF
adrp x1, l_lib@PAGE       // Just "lib"
add x1, x1, l_lib@PAGEOFF
bl _strcpy

adrp x0, l_buffer@PAGE
add x0, x0, l_buffer@PAGEOFF
adrp x1, l_example@PAGE   // Just "example"
add x1, x1, l_example@PAGEOFF
bl _strcat
// ... more strcat calls ...
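
The runtime construction above can also be written with bounds checking. The following is a minimal sketch, assuming a hypothetical helper named build_library_name; the fragment strings are taken from the example, and nothing here is specific to any particular loader.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: joins three name fragments into `out` at runtime.
 * Returns 0 on success, -1 if the joined name would not fit in `out`. */
int build_library_name(char *out, size_t out_size,
                       const char *prefix, const char *stem,
                       const char *suffix) {
    int written = snprintf(out, out_size, "%s%s%s", prefix, stem, suffix);
    if (written < 0 || (size_t)written >= out_size)
        return -1;  /* encoding error or truncated result */
    return 0;
}
```

Calling build_library_name(name, sizeof name, "lib", "example", ".dylib") fills name with the full library name only at run time. For the analyst, this is why dynamic observation, such as a breakpoint on dlopen, tends to recover names that a plain string search misses.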

DYLD_INSERT_LIBRARIES behavior

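The macOS dynamic linker can be influenced through environment variables. dyld ignores these variables for restricted processes, such as binaries protected by System Integrity Protection or built with the hardened runtime and no exempting entitlement:

# Shell: inject a library into processes started from this shell
export DYLD_INSERT_LIBRARIES=/path/to/example.dylib

// C: the same variable set through launchctl, which leaves no shell profile entry to inspect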
char cmd[256];
snprintf(cmd, sizeof(cmd), "launchctl setenv DYLD_INSERT_LIBRARIES %s",
         "/path/to/example.dylib");
system(cmd);
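
From the defensive side, a process can inspect its own environment for this variable. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it relies only on the documented fact that DYLD_INSERT_LIBRARIES holds a colon-separated list of library paths.

```c
#include <stddef.h>
#include <stdlib.h>

/* Counts the non-empty, colon-separated entries in a
 * DYLD_INSERT_LIBRARIES-style value. NULL or "" yields 0. */
size_t count_inserted_libraries(const char *value) {
    size_t count = 0;
    if (value == NULL)
        return 0;
    const char *p = value;
    while (*p != '\0') {
        while (*p == ':')               /* skip separators */
            p++;
        if (*p == '\0')
            break;
        count++;                        /* start of a new entry */
        while (*p != '\0' && *p != ':') /* consume the entry */
            p++;
    }
    return count;
}

/* Applies the count to the current process environment. */
size_t inserted_libraries_in_environment(void) {
    return count_inserted_libraries(getenv("DYLD_INSERT_LIBRARIES"));
}
```

A non-zero result only shows that the variable is present in the environment; it does not prove that the loader honored it.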

Function hooking via dynamic loader

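Functions can be intercepted by defining a replacement in an injected library and forwarding to the original implementation, which the hook resolves through the dynamic linker with dlsym(RTLD_NEXT, ...):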

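#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <sys/types.h>

// Hook implementation
int hooked_open(const char *path, int flags, ...) {
    // Log or modify parameters
    printf("Opening file: %s\n", path);

    // The mode argument is only passed when O_CREAT is set.
    // mode_t is promoted to int when passed through "...".
    mode_t mode = 0;
    if (flags & O_CREAT) {
        va_list args;
        va_start(args, flags);
        mode = (mode_t)va_arg(args, int);
        va_end(args);
    }

    // Original function pointer obtained through dlsym
    static int (*original_open)(const char*, int, ...) = NULL;
    if (!original_open)
        original_open = dlsym(RTLD_NEXT, "open");

    return original_open(path, flags, mode);
}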

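If the symbol name is built or decoded at runtime instead of being stored as a literal, the disassembly shows the call to dlsym but not which function is being hooked:

// Symbol name comes from a variable, so the target is not visible statically
adrp x19, l_function_name@PAGE
ldr x19, [x19, l_function_name@PAGEOFF]
mov x0, #-1                 // RTLD_NEXT is ((void *)-1) on macOS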
mov x1, x19
bl _dlsym
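
The interception pattern itself can be modeled without a real loader. The sketch below is a toy stand-in, not dyld's actual data structures: symbol_table, resolve, install_hook, and the add functions are all hypothetical names chosen for illustration. It shows the two essential steps, replacing a binding and forwarding to the saved original.

```c
#include <stddef.h>
#include <string.h>

typedef int (*binary_fn)(int, int);

/* Hypothetical stand-in for a loader's name-to-address bindings. */
struct symbol_entry {
    const char *name;
    binary_fn   address;
};

int real_add(int a, int b) { return a + b; }

struct symbol_entry symbol_table[] = {
    { "add", real_add },
};
const size_t symbol_count = sizeof symbol_table / sizeof symbol_table[0];

/* Callers bind through the table, so they get whatever address it holds. */
binary_fn resolve(const char *name) {
    for (size_t i = 0; i < symbol_count; i++) {
        if (strcmp(symbol_table[i].name, name) == 0)
            return symbol_table[i].address;
    }
    return NULL;
}

/* Swaps in a replacement and hands back the previous address.
 * Returns 0 on success, -1 if the name is not in the table. */
int install_hook(const char *name, binary_fn replacement, binary_fn *previous) {
    for (size_t i = 0; i < symbol_count; i++) {
        if (strcmp(symbol_table[i].name, name) == 0) {
            *previous = symbol_table[i].address;
            symbol_table[i].address = replacement;
            return 0;
        }
    }
    return -1;
}

binary_fn original_add = NULL;  /* filled in by install_hook */
int hook_calls = 0;

/* The hook observes the call, then forwards to the original. */
int hooked_add(int a, int b) {
    hook_calls++;
    return original_add(a, b);
}
```

After install_hook("add", hooked_add, &original_add), every caller that goes through resolve("add") reaches the hook first while still receiving the original result, which is what makes this kind of interception hard to notice from behavior alone.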

These techniques can complicate reverse engineering by hiding loaded libraries, obscuring function calls, or changing how analysis tools observe runtime behavior.

Why stack deallocation is necessary

Stack deallocation, which restores the stack pointer, is required for several reasons:

// Function with stack allocation and deallocation
function_example:
    // Allocate 32 bytes on stack
    sub sp, sp, #32

    // Use stack space...

    // Deallocate 32 bytes
    add sp, sp, #32
    ret

Memory management

Stack exhaustion: Without deallocation, each function call would permanently consume stack space. Since the stack is a limited resource (typically just a few MB), the program would quickly run out of stack space, resulting in a stack overflow crash.
Resource efficiency: The stack size is finite and allocated per thread. Improper deallocation wastes this limited resource.
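
The stack size limit can be checked on a POSIX system rather than assumed. This is a minimal sketch using getrlimit; the wrapper name stack_soft_limit is hypothetical, and the value it reports applies to the main thread's stack, since secondary threads get their size from thread attributes.

```c
#include <sys/resource.h>

/* Stores the soft stack limit in bytes in *bytes.
 * Returns 0 on success, 1 if the limit is unlimited, -1 on error. */
int stack_soft_limit(unsigned long long *bytes) {
    struct rlimit limit;
    if (getrlimit(RLIMIT_STACK, &limit) != 0)
        return -1;
    if (limit.rlim_cur == RLIM_INFINITY)
        return 1;
    *bytes = (unsigned long long)limit.rlim_cur;
    return 0;
}
```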

Function call convention

Caller/callee contract: The calling convention requires that functions restore the stack to its original state before returning. This is a fundamental contract between caller and callee.
Predictable return state: The caller expects the stack to be in the same state after the callee returns. Violating this expectation breaks the function call mechanism.

Stack frame integrity

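Return address integrity: On ARM64 the return address lives in the link register (x30), and non-leaf functions save it on the stack alongside x29. If the stack pointer is not restored, the epilogue reloads both registers from the wrong offsets, causing unpredictable returns.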
Frame pointer chain: Improper stack management breaks the frame pointer chain, corrupting stack traces for debugging.

What happens without deallocation

// Problematic function that doesn't deallocate stack
bad_function:
    // Allocate 32 bytes on stack
    sub sp, sp, #32

    // Use stack space...

    // Return WITHOUT deallocating!
    ret

This bad practice causes:

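Progressive stack loss: Each call to bad_function leaves the stack pointer 32 bytes lower than the caller expects
Eventual crash: If execution continues, repeated calls eventually exhaust the stack and trigger a stack overflow
Corrupted returns: A caller that reloads its saved x29 and x30 relative to sp reads them from the wrong location and returns to an unintended address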
Undefined behavior: The program may work for a while but exhibit increasingly erratic behavior

Stack memory vs heap memory

Unlike heap memory, which persists until explicitly freed, stack memory is tied to function execution scope:

Heap: malloc() allocates, memory persists until free() is called
Stack: Allocation is managed by stack pointer adjustment, and must be balanced within each function
Automatic cleanup: Local variables are “cleaned up” when they go out of scope only because the function epilogue restores the stack pointer. Compilers emit that epilogue automatically; hand-written assembly must restore the stack pointer explicitly
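
The difference in lifetime can be seen from C. In this minimal sketch, with hypothetical function names, the heap allocation outlives the function that created it, while the local variable exists only for the duration of its call.

```c
#include <stdlib.h>

/* Heap: the allocation outlives this function.
 * The caller owns the result and must free() it. */
int *make_heap_counter(int start) {
    int *counter = malloc(sizeof *counter);
    if (counter != NULL)
        *counter = start;
    return counter;
}

/* Stack: `local` lives in this call's frame, so only its value,
 * never its address, may be handed back to the caller. */
int next_value(int current) {
    int local = current + 1;
    return local;
}
```

A caller can keep using the pointer from make_heap_counter across any number of calls to next_value, and releases it with free() when finished.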

Debugging stack issues

Stack allocation/deallocation bugs are often detected through:

Stack corruption errors
Unexpected program crashes
Call stack corruptions in crash reports
Pointer arithmetic errors with stack-based data
Compiler warnings about frame pointer usage

In performance-critical code, functions may sometimes omit frame pointers (via compiler optimizations), but proper stack pointer management is always required for program correctness.
