About The Position

As the world changes direction toward Generative AI, the network has become the computer. We are looking for a visionary leader to head our USA Networking Cluster Validation team, where you will simulate the world’s largest AI data centers to ensure our InfiniBand and Ethernet solutions define the next era of computing. If you want to sit at the epicenter of the AI revolution and push the limits of hyper-scale networking, join us at NVIDIA.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or equivalent experience.
  • 8+ years of overall technical experience in networking or systems engineering.
  • 5+ years of experience in a formal team leadership or engineering management role.
  • Proven ability to multi-task, drive people toward deadlines, and manage high-priority tasks in a fast-paced environment.
  • Excellent communication and technical presentation skills; the ability to explain complex technical concepts to both R&D and executive stakeholders.
  • Strong debugging, analytical, and problem-solving skills with a "fast-learner" approach.

Nice To Haves

  • Proven experience in testing and qualifying AI cluster infrastructure, including performance tuning for large-scale GPU-to-GPU communication.
  • Deep experience with virtualization and container orchestration technologies such as KVM, Hyper-V, or Kubernetes.
  • Advanced knowledge of InfiniBand and Ethernet protocols (RDMA, RoCE).
  • Hands-on experience in programming or scripting (Python, Bash) for automated validation frameworks.

Responsibilities

  • Lead a high-performance engineering team dedicated to the qualification and integration of groundbreaking Networking AI/HPC cluster solutions.
  • Direct the design and testing of massive NVIDIA setups that simulate the production workloads of the world’s largest AI data center customers.
  • Partner with R&D to review architectural designs and requirements for next-generation features across the entire Ethernet and InfiniBand portfolio (switches and network adapters).
  • Oversee the creation of complex network topologies to ensure comprehensive product coverage, emphasizing the emulation of complex customer environments at scale.
  • Drive the roadmap for the testing automation team, ensuring seamless integration of new features into the software release cycles for data center products.
  • Serve as the primary Engineering Lead (PIC) for full verification cycles; assist in debugging complex customer use cases and perform root-cause analysis for critical system issues.
  • Manage comprehensive testing scopes including Regression, Performance, Functional, and Scale, providing executive-level summary reports on release readiness.
  • Foster a culture of technical excellence by mentoring team members and driving professional growth within the organization.

Benefits

  • Competitive salaries
  • Generous benefits package