Member of Technical Staff

Wafer · San Francisco, CA

About The Position

Wafer's mission is to maximize intelligence per watt by building AI that optimizes AI itself. Our journey starts with GPU kernels but will expand into every corner of ML systems and AI infrastructure. We're a small team (4 people) backed by Fifty Years, Y Combinator, Jeff Dean, and Woj Zaremba (co-founder of OpenAI), and we're looking for engineers who want to work at the intersection of AI agents and systems programming. You'll work directly with the founding team to build the systems that power our GPU optimization platform: the agent framework that iterates on kernels, the profiling infrastructure that connects to NCU and ROCprofiler, and the compiler tooling that analyzes PTX and SASS.

Requirements

  • Deep technical intuition and the ability to learn new domains quickly
  • Comfort working across the stack: Python, C++, TypeScript, CUDA
  • The ability to ship production code fast while maintaining quality
  • A desire to work on some of the most interesting AI infra problems at a small company with a no-bullshit, ship-fast culture

Nice To Haves

  • GPU programming experience (CUDA, HIP, Triton)
  • Experience with profiling tools or compiler internals
  • Background in AI/ML research or agent systems
  • Publications or open-source work in relevant areas

Responsibilities

  • Build and improve our framework for GPU kernel optimization (multi-turn tool use, state management, reward signals)
  • Develop integrations with GPU profilers and compiler toolchains
  • Design the architecture for remote GPU execution across cloud GPUs
  • Work on trace analysis systems that help the agent diagnose performance bottlenecks
  • Ship features that engineers use daily and that optimize the infrastructure running the world's AI (PyTorch, vLLM, NVIDIA, AMD, etc.)