About The Position

Join us in building the next generation of AI compilers. You’ll play a key role in developing the compiler for our novel AI accelerator, working side-by-side with hardware engineers and ML researchers. Your work will shape how deep learning workloads run on cutting-edge dataflow hardware—defining the instruction set, execution model, and developer experience. The result: a compiler that delivers breakthrough performance while remaining seamless and intuitive for ML developers.

Requirements

  • 3+ years of experience building compilers or high-performance systems software, especially those involving complex resource management or optimization
  • Expert in modern C++ (C++14/17/20) with strong Python skills
  • Experience with compiler IRs (SSA-based or graph-based), transformations, and code generation
  • Exposure to specialized accelerators (GPU, NPU, FPGA, or custom ASIC) or parallel architectures

Nice To Haves

  • Experience with machine learning compiler stacks (e.g., ONNX, MLIR, TVM, XLA, IREE, PyTorch), with contributions to MLIR or LLVM projects a plus
  • Experience with optimization methods (LP/MIP, CP, SAT/SMT) using solvers like Gurobi or OR-Tools for scheduling and resource allocation
  • Experience compiling DNN workloads for specialized accelerators (GPU, NPU, FPGA, or custom ASIC); GPU/DSP experience is valuable when combined with compiler backend work beyond kernel tuning
  • Familiarity with heterogeneous compilation, especially mixing custom accelerators with CPUs/GPUs/NPUs, and exposure to analog or in-memory compute is a plus
  • Experience collaborating on compiler–hardware co-design (architecture and ISA) to improve compiler usability and hardware efficiency

Responsibilities

  • Contribute across the full compiler stack, including operator lowering, graph/IR transformations, optimization passes, and backend code generation
  • Optimize for dataflow architectures, developing pipelined schedules, memory orchestration, and resource-constrained execution strategies
  • Collaborate with hardware architects to influence architectural features, ensuring the compiler and hardware evolve together
  • Develop compilation strategies that unify our analog compute with digital subsystems
  • Build and maintain a compiler that produces high-performance binaries with strong debugging support, clear error messages, and predictable performance models

Benefits

  • The opportunity to shape how deep learning and LLM workloads are compiled on novel hardware
  • A role that spans software and hardware co-design, shaping both the compiler and the accelerator architecture
  • A collaborative, innovative team that values engineering rigor, continuous integration, and user-focused design
  • Competitive compensation, equity, and benefits package