Principal AI Software Engineer

Advanced Micro Devices, Inc., San Jose, CA

About The Position

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE: AMD AI Group is seeking a highly influential technical leader in ROCm software for New Product Initiatives. This role requires driving innovation and developing next-generation products across AMD’s broad portfolio, serving the Instinct, Radeon, Ryzen, Embedded, Gaming, and Autonomous Driving product lines. The ideal candidate will shape the end-to-end ROCm software and influence the full stack, spanning compilers, kernels, runtime, libraries, models, frameworks, and performance optimization layers. A key expectation is strong hardware/software co-design leadership to maximize performance, efficiency, scalability, and programmability across diverse AMD products and workloads.

THE PERSON: You are a strong technical leader who can work across organizations, align architecture with product goals, and influence innovation and execution. You bring deep expertise in at least one major area of the stack, while also demonstrating the curiosity, adaptability, and technical breadth to learn new domains, grow your impact, and influence the broader ecosystem. As a successful candidate, you are recognized for technical excellence and the ability to deliver innovations through collaboration across silicon, system software, AI/ML frameworks, libraries, and application enablement teams.

Requirements

  • Knowledge of GPU architectures and basic knowledge of CPU architecture
  • Experience in AI/ML software stack spanning compilers, kernels, runtime, libraries, models, frameworks, and performance optimization layers
  • Understanding of GPU programming platforms such as ROCm, CUDA, and OpenCL
  • Experience in hardware/software co-design and in building high-performance products across the full product lifecycle

Nice To Haves

  • Experience with operating systems (OS) and device driver development

Responsibilities

  • Hardware-Software Co-design: Collaborate across hardware architecture, compiler, math libraries, kernel and framework teams to influence future silicon features based on evolving AI workload trends.
  • Strong Execution: Deliver innovations and a roadmap for the AI software stack across all AMD products, ensuring AMD remains the platform of choice for top-tier AI customers.
  • Workload Performance Engineering: Lead the profiling, analysis, and tuning of large-scale models (LLMs, Diffusion, Multimodal, and MoE) to ensure "out-of-the-box" performance excellence on AMD hardware.
  • Ecosystem Innovation: Drive the development of advanced tools and frameworks for performance estimation, modeling, and automated reporting.
  • Customer Engagement: Partner with top customers and hyperscalers to understand their unique workload requirements and deliver tailored architectural wins and software optimizations.
  • Community & Open Source: Mentor and inspire other engineers and contribute to the ROCm open source project.

Benefits

  • AMD benefits at a glance.