MTS Software System Design Engineer

Advanced Micro Devices, Inc., Austin, TX

About The Position

At AMD, our mission is to build great products that accelerate next-generation computing experiences—from AI and data centers, to PCs, gaming and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity and a shared passion to create something extraordinary. When you join AMD, you’ll discover the real differentiator is our culture. We push the limits of innovation to solve the world’s most important challenges—striving for execution excellence, while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

Requirements

  • Master’s degree or foreign equivalent in Computer and Information Science, Computer Engineering, Electrical Engineering or related field and three years’ experience in the job offered or a closely related engineering role.
  • OR Bachelor’s degree or foreign equivalent in Computer and Information Science, Computer Engineering, Electrical Engineering or related field and five (5) years of progressive post-baccalaureate experience in the job offered or a closely related engineering role.
  • Designing and implementing large-scale infrastructure solutions
  • Kubernetes and container orchestration technologies
  • AI/ML workloads in production environments
  • Datacenter networking and storage architectures
  • GPU/AI-accelerated computing environments
  • Creating technical documentation and reference architectures
  • Infrastructure automation and orchestration tools
  • Performance optimization for large-scale inference deployments
  • Ray, PyTorch, and HPC-optimized schedulers for Kubernetes-based AI training
  • SLURM or similar HPC schedulers
  • Infrastructure-as-code tools such as Terraform or Ansible
  • Performance tuning for GPU/AI-accelerated workloads
  • Creating automation tools for infrastructure deployment

Responsibilities

  • Design, test, and validate reference architectures for large-scale AI training and inference clusters.
  • Develop comprehensive tools for AI training to enable efficient cluster management.
  • Create detailed reference documentation and implementation guides for customers and internal teams.
  • Serve as the primary technical interface with customer engineering teams during deployment planning.
  • Conduct proof-of-concept implementations to validate designs in real-world scenarios.
  • Evaluate and benchmark performance of various infrastructure configurations.
  • Provide expert guidance on optimizing Kubernetes for AI workloads at scale.
  • Collaborate with product management to influence roadmap based on customer requirements.
  • Maintain deep technical expertise in emerging AI infrastructure technologies.
  • Coordinate customer requirements gathering and work with the relevant Technical Program Management counterpart to arrive at a deployment plan.
  • Create comprehensive, tested reference architectures that accelerate customer deployments.
  • Drive test and interoperability validation with our HW and SW partners, and lead implementation of reference datacenter solutions at our CSP partners.
  • Develop automation tools that significantly reduce deployment complexity.
  • Become a trusted advisor to customer technical teams.
  • Contribute to increased win rates through technical credibility and expertise.
  • Provide regular feedback that improves our product roadmap and offering.

Benefits

  • AMD benefits at a glance