
x86-64 & ARM64 Calling Conventions Demystified

Learn x86-64 and ARM64 calling conventions across Linux, macOS, and Windows. Master parameter passing, stack frames, and register preservation with hands-on assembly examples.

In Part 1 of this series, we explored how to make syscalls directly to the kernel. But most real programs rarely call syscalls directly — instead, they call functions, which may themselves invoke syscalls deep in the call stack.

When one function calls another, both must agree on a contract: Where do arguments go? Which registers can I clobber? Who cleans up the stack? This contract is called a calling convention, and it varies dramatically between platforms. Understanding calling conventions is essential for any low-level programmer.

This comprehensive guide covers the three major calling conventions you’ll encounter when writing cross-platform assembly:

| Convention | Platforms | Architecture |
| --- | --- | --- |
| System V AMD64 ABI | Linux, macOS, FreeBSD | x86-64 |
| Microsoft x64 | Windows | x86-64 |
| AAPCS64 | Linux, macOS, Windows | ARM64 |

What Calling Conventions Define

A calling convention specifies the rules that both the caller and callee must follow:

  1. Parameter passing — Which registers hold arguments, and in what order
  2. Return values — Where the result goes
  3. Register preservation — Which registers must be saved/restored by the callee
  4. Stack layout — Alignment requirements, shadow space, red zones
  5. Cleanup responsibility — Who adjusts the stack pointer after the call

Stack Frame Visualization

Understanding calling conventions becomes much easier when you visualize the memory layout for each ABI. Below is a comparison of how each architecture structures its stack frame during a function call.

System V AMD64 (Linux/macOS)

The System V convention keeps most arguments off the stack. It passes up to six integer arguments in registers and gives leaf functions a 128-byte “Red Zone” below rsp.

Previous Frame (High Address)
Return Address (8 bytes)
Saved RBP (8 bytes)
Local Variables (rsp points here)
Red Zone (128 bytes below rsp)

Windows x64 (Microsoft)

Windows requires Shadow Space (also known as Home Space) allocated by the caller, even if arguments are passed in registers. This space is used for spilling registers during debugging.

Previous Frame (High Address)
Shadow Space (32 bytes, for rcx, rdx, r8, r9)
Return Address (8 bytes)
Saved RBP (8 bytes)
Local Variables (rsp points here)

ARM64 (AAPCS64)

ARM64 uses the Link Register (x30) to store the return address. The Frame Pointer (x29) and Link Register are typically saved as a pair at the start of the stack frame.

Previous Frame (High Address)
Saved x29 / x30 (16 bytes: FP + Link Register)
Local Variables
Stack Arguments (9+)



System V AMD64 ABI (Linux & macOS x86-64)

The System V AMD64 ABI is the most widely used calling convention on Unix-like systems. It’s efficient, using many registers for parameter passing which reduces stack operations.

Integer and Pointer Arguments

The first six integer/pointer arguments go in registers, in this order:

| Argument | Register |
| --- | --- |
| 1st | rdi |
| 2nd | rsi |
| 3rd | rdx |
| 4th | rcx |
| 5th | r8 |
| 6th | r9 |

Arguments beyond the sixth are pushed onto the stack in reverse order (right-to-left).

Floating-Point Arguments

The first eight floating-point arguments use vector registers: xmm0 - xmm7.

Variadic Note: For variadic functions (like printf), the caller must set al (the low 8 bits of rax) to the number of vector registers used for arguments. The ABI only requires an upper bound (0 to 8), not an exact count. This lets the callee skip saving the XMM registers when no vector arguments were passed.
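Seen from C, the rule is invisible because the compiler sets al at each call site. The sketch below is a hypothetical variadic helper (sum_doubles is not from the ABI or any library) that makes the mechanism concrete. For a call such as sum_doubles(3, 1.0, 2.0, 3.5) on System V, count goes in edi and the three doubles in xmm0-xmm2, and a compiler would typically load 3 into eax before the call.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper for illustration: sums `count` doubles passed
 * as variadic arguments. */
double sum_doubles(int count, ...)
{
    assert(count >= 0);

    va_list args;
    /* On System V, the callee's prologue may consult al to decide
     * whether xmm0-xmm7 need to be saved for va_arg to read later. */
    va_start(args, count);

    double total = 0.0;
    for (int i = 0; i < count; i++) {
        total += va_arg(args, double);
    }

    va_end(args);
    return total;
}
```

Compiling a caller of this function with optimizations and inspecting the assembly is a quick way to see the al setup next to the xmm register loads.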

Return Values

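| Type | Location |
| --- | --- |
| Integer/pointer (≤64 bits) | rax |
| Integer (128 bits) | rax (low), rdx (high) |
| Floating-point | xmm0 |
| Struct (small, ≤16 bytes) | rax/rdx or xmm0/xmm1 |
| Struct (large) | Caller passes hidden pointer in rdi; callee returns that pointer in rax |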
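The two struct rows are easiest to see with a pair of C functions. This is a minimal sketch under System V assumptions; the type and function names (pair64, quad64, make_pair64, make_quad64) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 16 bytes of integer fields: on System V this is returned in
 * rax (lo) and rdx (hi), with no memory traffic required. */
struct pair64 {
    int64_t lo;
    int64_t hi;
};

/* 32 bytes: too large for the register pair, so the caller passes a
 * hidden pointer to the result buffer in rdi and `seed` shifts to rsi. */
struct quad64 {
    int64_t v[4];
};

static_assert(sizeof(struct pair64) == 16, "small enough for rax:rdx");
static_assert(sizeof(struct quad64) == 32, "returned through memory");

struct pair64 make_pair64(int64_t lo, int64_t hi)
{
    struct pair64 p = { lo, hi };
    return p;
}

struct quad64 make_quad64(int64_t seed)
{
    struct quad64 q = { { seed, seed + 1, seed + 2, seed + 3 } };
    return q;
}
```

The hidden pointer matters when calling such a function from assembly: every visible argument moves one register later than the signature suggests.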

Register Preservation

This is critical: if you clobber a callee-saved register without restoring it, you’ll corrupt the caller’s state.

| Caller-saved (volatile) | Callee-saved (non-volatile) |
| --- | --- |
| rax, rcx, rdx, rsi, rdi, r8-r11 | rbx, rbp, r12-r15, rsp |

Callee-saved means: if your function uses these registers, you must save them at entry and restore them before returning.

Stack Requirements

  • 16-byte alignment: The stack pointer (rsp) must be 16-byte aligned before the call instruction. Since call pushes an 8-byte return address, rsp will be misaligned by 8 bytes at function entry. Prologues typically subtract 8 (or 8 + 16n) to restore alignment.
  • Red zone: Leaf functions (functions that don’t call other functions) can use 128 bytes below rsp without adjusting the stack pointer. This optimization avoids prologue/epilogue overhead for simple functions.
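The alignment arithmetic can be sketched as a small C helper. The function name frame_adjustment is hypothetical. It assumes the caller followed the rule, so rsp is 8 bytes past a 16-byte boundary at entry, and that every push is 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Returns how many bytes the prologue should subtract from rsp, after
 * pushing `pushed_regs` registers, to make room for `locals_bytes` and
 * leave rsp 16-byte aligned for the next call. */
size_t frame_adjustment(size_t locals_bytes, size_t pushed_regs)
{
    const size_t return_addr = 8;   /* pushed by the call instruction */
    size_t used   = return_addr + 8 * pushed_regs + locals_bytes;
    size_t total  = (used + 15) & ~(size_t)15;   /* round up to 16 */
    size_t adjust = total - return_addr - 8 * pushed_regs;

    assert(adjust >= locals_bytes);
    assert((return_addr + 8 * pushed_regs + adjust) % 16 == 0);
    return adjust;
}
```

With no pushes and no locals the result is 8, matching the "subtract 8" rule of thumb. With one push (for example push rbp) and no locals the result is 0, because the push itself restores alignment.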

Code Example

asm
# Linux x86-64: add two numbers, print result
# Assemble: cc -o example example.s
# Run: ./example

.intel_syntax noprefix        # Intel operand order, no % register prefixes

.section .data
fmt:    .asciz "Sum of %d and %d is %d\n"

.section .text
.global main

# int add_nums(int a, int b)
# Arguments: edi = a, esi = b
# Returns: eax = a + b
add_nums:
  lea     eax, [rdi + rsi]    # eax = low 32 bits of rdi + rsi
  ret

main:
  push    rbp                 # Save frame pointer; also realigns rsp to 16
  mov     rbp, rsp
  
  # Call add_nums(5, 3)
  mov     edi, 5              # First argument
  mov     esi, 3              # Second argument
  call    add_nums            # Result in eax
  
  # Call printf(fmt, 5, 3, result)
  mov     ecx, eax            # 4th arg: result
  mov     edx, 3              # 3rd arg: second number
  mov     esi, 5              # 2nd arg: first number
  lea     rdi, [rip + fmt]    # 1st arg: format string
  xor     eax, eax            # al = 0: no vector arguments
  call    printf
  
  xor     eax, eax            # Return 0
  pop     rbp
  ret

Note the xor eax, eax before calling printf — this tells the variadic function that we’re not passing any floating-point arguments in vector registers. This is a key detail of the System V calling convention.


Microsoft x64 Calling Convention (Windows)

Windows uses a convention that prioritizes simplicity and debugging. Unlike System V, it requires the caller to provide “Shadow Space” for register arguments.

Key Differences from System V

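| Aspect | System V | Windows x64 |
| --- | --- | --- |
| Register arguments | 6 | 4 |
| Shadow space | None | 32 bytes required |
| Red zone | 128 bytes | None |
| Variadic handling | al = vector register count | No count register; floating-point variadic args are also copied into the matching integer register |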

Argument Registers

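The Windows calling convention has only four register argument slots, and an argument's position selects its register: the third argument always uses r8 (integer/pointer) or xmm2 (floating point), regardless of the types of the first two.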

| Argument | Integer/Pointer | Floating Point |
| --- | --- | --- |
| 1st | rcx | xmm0 |
| 2nd | rdx | xmm1 |
| 3rd | r8 | xmm2 |
| 4th | r9 | xmm3 |

Shadow Space: The caller must always allocate 32 bytes of “shadow space” on the stack immediately above the return address. This space is technically “owned” by the callee, which can use it to store (spill) the first four parameters if needed for debugging or varargs processing.
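One consequence of the shadow space is that every Windows x64 argument has a predictable stack slot, whether or not it arrived in a register. The sketch below computes those offsets at callee entry, before any prologue runs, and contrasts them with System V. Both function names are hypothetical, and the System V helper assumes every argument is an integer or pointer of at most 8 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Windows x64: offset from rsp, at callee entry, of the slot for
 * argument `index` (0-based). Slots 0-3 are the shadow space for
 * rcx, rdx, r8, r9; slot 4 onward holds the real stack arguments. */
size_t win64_arg_offset(size_t index)
{
    return 8 + 8 * index;   /* skip the return address, then 8 bytes per slot */
}

/* System V: offset from rsp, at callee entry, of integer argument
 * `index` (0-based). Only arguments 7 and later (index >= 6) live on
 * the stack; the first six have no stack slot at all. */
size_t sysv_stack_arg_offset(size_t index)
{
    assert(index >= 6);
    return 8 + 8 * (index - 6);
}
```

On Windows the fifth argument therefore sits at [rsp + 40] on entry, while on System V the seventh argument sits at [rsp + 8].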

Code Example (Windows)

asm
; Windows x64: add two numbers, print result
; Assemble: ml64 /c example.asm
; Link: link /subsystem:console example.obj msvcrt.lib

.data
fmt     db "Sum of %d and %d is %d", 10, 0

.code
extern printf:proc

add_nums proc
  lea     eax, [rcx + rdx]    ; First two args in rcx, rdx
  ret
add_nums endp

main proc
  sub     rsp, 40             ; 32 shadow + 8 alignment
  
  ; Call add_nums(5, 3)
  mov     ecx, 5
  mov     edx, 3
  call    add_nums
  
  ; Call printf(fmt, 5, 3, result)
  mov     r9d, eax            ; 4th arg
  mov     r8d, 3              ; 3rd arg
  mov     edx, 5              ; 2nd arg
  lea     rcx, fmt            ; 1st arg
  call    printf
  
  xor     eax, eax
  add     rsp, 40
  ret
main endp
end

AAPCS64 (ARM64)

ARM64 uses the AAPCS64 calling convention across all platforms, including Apple Silicon.

Note on Apple ARM64 (macOS) deviations: Apple Silicon Macs follow AAPCS64 in most cases, but Apple’s ABI has documented deviations from the standard. Most notably, the rules for passing small structs and for variadic functions (including the va_list layout) differ from Linux ARM64. For example, Apple platforms pass variadic arguments on the stack rather than in registers, so the printf call in the Linux example below would not work unchanged on macOS. For general integer/pointer arguments to non-variadic functions, the conventions are identical. See Apple’s ARM64 ABI documentation for details.

Argument Passing

| Argument | Register |
| --- | --- |
| 1st-8th integer | x0-x7 |
| 1st-8th float | v0-v7 |

Register Preservation

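| Caller-saved (Volatile) | Callee-saved (Non-Volatile) |
| --- | --- |
| x0-x17 | x19-x28, x29 (fp), sp |

Two registers need special care. x18 is the platform register: some operating systems, including macOS and Windows, reserve it, so portable code should leave it alone. x30 (lr) is overwritten by every bl, so a function that makes calls must save and restore it itself, as the stp/ldp pair in the example below does.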

Code Example (ARM64)

asm
// ARM64 Linux: add two numbers, print result
// Assemble: cc -o example example.s
// Run: ./example

.section .data
fmt:    .asciz "Sum of %d and %d is %d\n"

.section .text
.global main

add_nums:
  add     w0, w0, w1          // w0 = w0 + w1
  ret

main:
  stp     x29, x30, [sp, #-16]!
  mov     x29, sp
  
  // Call add_nums(5, 3)
  mov     w0, #5
  mov     w1, #3
  bl      add_nums
  
  // Call printf(fmt, 5, 3, result)
  mov     w3, w0              // 4th arg
  mov     w2, #3              // 3rd arg
  mov     w1, #5              // 2nd arg
  adrp    x0, fmt
  add     x0, x0, :lo12:fmt
  bl      printf
  
  mov     w0, #0
  ldp     x29, x30, [sp], #16
  ret

Side-by-Side Comparison

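| Feature | System V AMD64 | Windows x64 | AAPCS64 |
| --- | --- | --- | --- |
| Integer args | 6 regs | 4 regs | 8 regs |
| Float args | 8 regs | 4 regs | 8 regs |
| Red zone | 128 bytes | None | None |
| Shadow space | None | 32 bytes | None |
| Stack alignment | 16 bytes | 16 bytes | 16 bytes |
| Return value | rax | rax | x0 |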
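The integer-argument row can also be written as a lookup, which is handy when porting a call sequence from one convention to another. This is a minimal sketch with made-up names (enum abi, int_arg_location, int_arg_on_stack). It covers only integer and pointer arguments and assumes no floating-point arguments are mixed in, which matters on Windows because slots there are positional.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum abi { ABI_SYSV, ABI_WIN64, ABI_AAPCS64 };

/* Returns the register that holds integer argument `index` (0-based),
 * or "stack" once the convention runs out of argument registers. */
const char *int_arg_location(enum abi abi, unsigned index)
{
    static const char *const sysv[]  = { "rdi", "rsi", "rdx", "rcx", "r8", "r9" };
    static const char *const win64[] = { "rcx", "rdx", "r8", "r9" };
    static const char *const aapcs[] = { "x0", "x1", "x2", "x3",
                                         "x4", "x5", "x6", "x7" };

    switch (abi) {
    case ABI_SYSV:    return index < 6 ? sysv[index]  : "stack";
    case ABI_WIN64:   return index < 4 ? win64[index] : "stack";
    case ABI_AAPCS64: return index < 8 ? aapcs[index] : "stack";
    }
    assert(!"unknown ABI");
    return "stack";
}

/* True when integer argument `index` no longer fits in a register. */
bool int_arg_on_stack(enum abi abi, unsigned index)
{
    return strcmp(int_arg_location(abi, index), "stack") == 0;
}
```

For a function taking five integers, the fifth argument is in r8 on System V and x4 on AAPCS64, but already on the stack on Windows x64.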

Common Pitfalls and Debugging Tips

1. Stack Alignment (The 16-byte rule)

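All three major 64-bit conventions require the stack pointer (rsp or sp) to be 16-byte aligned at every call or bl instruction. On x86-64, a misaligned stack typically surfaces as a segfault when the callee executes an aligned SSE or AVX access, such as movaps, on a stack slot. On ARM64 the rule is stricter: the hardware can check sp alignment on every sp-relative load or store, and mainstream operating systems typically enable that check, so a misaligned sp usually faults on the next stack access rather than deep inside a callee.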

2. Shadow Space Corruption

On Windows, even if you pass zero arguments, you must reserve 32 bytes of shadow space (plus any padding needed for 16-byte alignment) before calling a function. If you don’t, the callee may overwrite your locals, saved registers, or return address when it spills its (possibly non-existent) register arguments into the shadow space it assumes you provided.

3. Clobbering Callee-Saved Registers

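If your function uses rbx, rbp, or r12-r15 (on x86-64) or x19-x28 (on ARM64), you must save them to the stack and restore them before returning. On Windows x64 the callee-saved set is larger: it also includes rdi, rsi, and xmm6-xmm15, so System V code that treats rdi and rsi as scratch registers breaks when ported.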


What’s Next?

Now you understand the contracts governing function communication. In Part 3, we’ll explore program startup: what happens before main() runs and how command-line arguments are passed.

© 2026 Coder Musings. All rights reserved. Built for the systems community.