Director, Program Management

Advanced Micro Devices, Inc.
Austin, TX

About The Position

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

We are seeking a Director-level Technical Program Manager to serve as the Product Owner for AMD’s internal infrastructure platform supporting Instinct™ GPU development and deployment. In this role, you will drive the transformation of AMD’s global data center infrastructure into a highly automated, scalable, and resilient platform capable of supporting rapidly growing AI, HPC, and accelerated computing workloads. You will operate at the intersection of Platform Engineering, Site Reliability Engineering (SRE), and Infrastructure Operations, defining the long-term vision and execution strategy for infrastructure automation, provisioning, and lifecycle management. Your work will directly enable AMD’s Instinct product roadmap by ensuring infrastructure readiness, availability, and scalability across global deployments. This role provides significant visibility and impact, requiring close engagement with senior engineering leaders and executive stakeholders while shaping the technical and operational foundation for AMD’s next-generation AI platforms.

You bring a strong infrastructure product mindset and thrive in environments where hardware, software, and automation intersect at scale. You have a proven track record leading complex, global infrastructure programs involving heterogeneous systems, high-density compute platforms, and distributed data center environments. You are a systems thinker who goes beyond traditional program management, designing scalable processes, automation strategies, and operational frameworks that enable infrastructure to scale efficiently without proportional increases in operational overhead.

You communicate effectively across technical and executive audiences, confidently engaging with platform engineers, SRE teams, and senior leadership while driving alignment and execution across global teams. You are proactive, structured, and execution-focused, with the ability to translate long-term infrastructure vision into clear milestones, measurable progress, and successful deployment outcomes.

Nice To Haves

  • Experience leading large-scale infrastructure or technical programs involving high-density GPU platforms, AI clusters, or HPC environments
  • Strong background in infrastructure automation, provisioning, and lifecycle management using infrastructure-as-code frameworks
  • Experience working with HPC workload managers such as SLURM and container orchestration platforms such as Kubernetes
  • Familiarity with automation tools and scripting languages such as Python, Ansible, and shell scripting
  • Experience supporting large-scale data center environments, including high-performance networking and storage platforms such as Weka, Pure Storage, or NetApp
  • Experience managing complex technical programs using project management tools such as Jira, including hardware lifecycle and service workflows
  • Strong ability to develop infrastructure roadmaps, executive reporting, and deployment readiness tracking frameworks
  • Experience working in environments supporting AI, machine learning, or accelerated compute platforms

Responsibilities

  • Define and drive the multi-year vision, roadmap, and execution strategy for AMD’s internal infrastructure platform supporting Instinct GPU development and validation
  • Own global infrastructure deployment readiness, tracking progress, risks, dependencies, and operational health across multiple data center locations
  • Lead and scale automation initiatives for provisioning, configuration management, and hardware lifecycle management across heterogeneous infrastructure environments
  • Serve as the primary technical program interface to executive leadership, providing clear reporting on infrastructure readiness, platform health, and roadmap alignment
  • Coordinate cross-functional efforts across Platform Engineering, SRE, Infrastructure Operations, and product teams to ensure timely infrastructure availability
  • Establish scalable operational processes, automation workflows, and lifecycle management strategies to support global infrastructure growth
  • Drive adoption of infrastructure-as-code, automated provisioning, and lifecycle orchestration to improve deployment speed, consistency, and reliability
  • Enable infrastructure scalability to support accelerated product development, validation, and deployment timelines aligned with AMD’s Instinct roadmap

Benefits

  • AMD benefits at a glance.