Senior Compiler Engineer, GPU Code Object Rewriting & Tooling

Advanced Micro Devices, Inc.
San Jose, CA
Hybrid

About The Position

We are building first-class compilation and code-object tooling for HIP, OpenCL, OpenMP, and the broader ROCm stack. Our compilers, loaders, and post-link tools underpin every HPC application and AI framework that runs on AMD GPUs. We are investing heavily in on-the-fly ISA rewriting and hot-patching infrastructure — inside the Code Object Manager (COMGR) and the AMDGPU backend — that lets us ship hardware fixes, errata workarounds, instrumentation, and performance experiments without recompiling user code. We are looking for a versatile Senior Compiler Engineer who can move fluidly between LLVM MC-level rewriting, ELF/DWARF manipulation, AMDGPU codegen, and the tooling that ties it all together. This is a multi-year investment area: the rewriting infrastructure starts as an errata-mitigation platform and grows into a long-term foundation for post-link transformation, binary instrumentation, and experimentation across multiple generations of AMD GPU silicon. You will own this codebase as it matures.

Requirements

  • Strong C/C++ programming skills, with a demonstrated ability to write careful, bounds-checked code against untrusted binary input
  • Strong background in compilers and compiler IRs — LLVM IR, MachineIR, or an equivalent production compiler stack
  • Hands-on experience with the LLVM MC layer (MCInst, MCDisassembler, MCCodeEmitter, MCStreamer, TargetRegistry)
  • Experience designing or extending custom in-tree IRs — pass infrastructure, dataflow analyses, SSA construction, dominance, and target-specific lowering — particularly in the context of lifting low-level code into a more analyzable form
  • Exposure to binary lifting / raising — llvm-mctoll, QEMU TCG lifting, RetDec, BAP, angr, or Ghidra P-code — and the practical challenges of reconstructing SSA and control flow from disassembled machine code
  • Working knowledge of ELF, DWARF, and related object-file formats; comfort reading and modifying binaries at the byte level
  • Familiarity with GPU ISAs (AMDGPU / GCN / RDNA / CDNA, or NVIDIA PTX/SASS) — registers, encodings, branch ranges, scheduling constraints
  • Experience with dataflow analyses (liveness, reaching-definitions, dominance) and basic register allocation
  • Understanding of GPU execution models: waves/warps, VGPRs/SGPRs, LDS, kernel descriptors, launch bounds, occupancy
  • Clang/LLVM upstream contribution experience
  • Exposure to the ROCm stack (COMGR, HIP, HSA runtime, hipify) or an equivalent heterogeneous toolchain
  • Background in any of: debug information (DWARF/PDB), binary instrumentation, dynamic binary translation, JIT engines, linker internals, or code-object loaders

Responsibilities

  • Design, implement, and maintain the HotSwap ISA rewriting subsystem in COMGR (amd/comgr/src/comgr-hotswap-) — including ELF patching, DWARF debug-line adjustment, trampoline growth, NOP-sled management, and branch encoding
  • Build and extend LLVM MC-based disassembly, assembly, and re-encoding pipelines used by post-link transformation tools
  • Prototype and evaluate raising-based rewriting pipelines — lifting disassembled AMDGPU machine code into a structured intermediate representation (LLVM MachineIR or a domain-specific in-tree IR) for analysis and transformation, then lowering back to valid code objects
  • Author ISA-specific rewrite policies (e.g., GFX1250 B0-to-A0 style errata mitigations) and generalize them into reusable, ISA-parametric infrastructure
  • Implement and harden CFG construction, backward liveness analysis, and scratch VGPR allocation on raw AMDGPU machine code
  • Adjust ELF section/program headers, AMDGPU notes, kernel descriptors, and code-object metadata safely on malformed or adversarial inputs
  • Contribute to the AMDGPU LLVM backend, Clang driver, and LLD where rewriting needs first-class compiler support
  • Participate in new architecture and silicon bring-ups — owning the compiler/tooling path from bring-up workarounds to long-term codegen quality
  • Analyze, reproduce, and fix issues across the compiler, loader, and runtime boundary; build unit tests, fuzzers, and regressions for each fix
  • Collaborate with ROCm runtime, HSA, and hardware architecture teams spread across geographic locations
  • Represent AMD in open-source communities (e.g., LLVM) and relevant standards bodies (e.g., DWARF Committee) through upstream patches, RFCs, and design reviews

Benefits

  • AMD benefits at a glance