About The Position

NVIDIA is seeking a highly motivated technical leader to design, drive, and operationalize rack-scale factory and deployment flows for next-generation data center products. The ideal candidate will possess deep systems expertise, decisive technical leadership, and a passion for building reliable, debuggable, and scalable manufacturing and deployment solutions. This role is crucial for NVIDIA's rapidly expanding ecosystem of data center platform & node designs, which are core to the company's growing enterprise and cloud provider businesses. These designs integrate NVIDIA GPUs, NVLink, InfiniBand networking, Grace CPUs, and an optimized AI and HPC software stack.

Requirements

  • BS or MS degree in Computer Engineering, Computer Science, or a related field, or equivalent experience.
  • 8+ years of experience in system architecture and design.
  • Deep experience designing architectures for scalable, performant server systems, particularly at the SW/HW interface.
  • Strong understanding of networking technologies and protocols (e.g., Ethernet, InfiniBand).
  • Previous experience working with complex system software for accelerators such as GPUs, DPUs, or FPGAs.
  • Expertise in out-of-band and in-band management architectures.
  • Knowledge of system management protocols such as Redfish and IPMI.
  • Demonstrable experience implementing shift-left strategies to de-risk program execution.
  • Excellent written and verbal communication skills.

Nice To Haves

  • Knowledge of large-scale cloud and cluster-level deployment and management systems.
  • Demonstrated track record of leading data center products across the entire lifecycle, spanning inception, pre-silicon development, post-silicon bring-up, manufacturing, and deployment.

Responsibilities

  • Lead and drive rack-scale/L11 flows for factory and initial data center deployment.
  • Design and implement end-to-end factory workflows, including firmware flashing sequences, security provisioning, and deployment of software mitigations.
  • Collaborate with data center architects, ODMs, and OEMs to define factory and data center requirements that ensure efficient and reliable production ramp.
  • Champion reliability, debuggability, and optimization in firmware, diagnostic, and deployment tool design.
  • Drive pre-silicon readiness for factory & manufacturing workflows for rack-scale products using NVIDIA's simulation & emulation technology.
  • Mentor architects and engineering teams to grow them into future leaders.
  • Make key technical decisions even when faced with ambiguity.

Benefits

  • Equity
  • Benefits