Manager, Distinguished Engineer - DGX Systems Software

NVIDIA, Santa Clara, CA
$320,000 - $488,750

About The Position

NVIDIA DGX systems are the foundation of the world’s most advanced AI infrastructure, comprising purpose-built servers, workstations, and personal AI computers that integrate GPUs, CPUs, NVLink, NVIDIA Networking, and an optimized AI software stack. NVIDIA is seeking an engineering leader to own the end-to-end delivery of every DGX compute system, from firmware through the AI stack to customer deployment. The successful candidate will ensure that each DGX product ships as a production-ready system in which firmware, OS, drivers, CUDA, networking, and AI applications work together seamlessly, and will drive the architecture and roadmap for next-generation platforms. NVIDIA is a leader in groundbreaking developments in Artificial Intelligence, High-Performance Computing, and Visualization, with the GPU at the core of its products and services.

Requirements

  • BS or MS in Computer Science, Electrical Engineering, or a related field, or equivalent experience.
  • 12+ years overall in systems firmware/software engineering, with 5+ years in engineering leadership.
  • Deep expertise in the server system stack, including SBIOS, BMC, OS, and applications, and in system-level integration of complex multi-component products.
  • Proven track record delivering multi-generation server or data center platforms from architecture through customer deployment.
  • Experience managing engineering organizations across multiple geographies in a matrix environment.
  • Strong understanding of server hardware: CPU, GPU, interconnect, memory, PCIe, power delivery.
  • Experience owning end-to-end product quality—from firmware validation through full-stack system testing to field deployment.

Nice To Haves

  • Experience with NVIDIA DGX or other GPU-accelerated server platforms.
  • Track record driving server bring-up for new silicon and system architecture redesigns.
  • Familiarity with DMTF Redfish, OCP standards, and server manageability ecosystems.
  • Experience with AI/DL workload validation and performance optimization at the platform level.
  • Demonstrated ability to operate at VP/SVP level, influencing cross-BU strategic decisions.

Responsibilities

  • Ensure every DGX platform is ready for the full NVIDIA software stack—firmware, DGX OS, GPU drivers, CUDA toolkit, DCGM, DOCA/OFED, and management tools—as a validated, production-quality product.
  • Own the GA SW/FW release process, delivering firmware bundles, BaseOS ISOs, and release notes to OEM/OSV partners.
  • Ensure platforms support AI agents such as NemoClaw and Hermes agents, NIM microservices, and the workloads customers expect out of the box.
  • Lead development of the manageability firmware stack (BMC, BIOS) for all DGX platforms.
  • Ensure firmware from partner teams (GPU, CPU, networking) integrates correctly at system level.
  • Manage 3rd-party vendors and drive platform requirements (NVPOR) across all firmware areas.
  • Define validation strategy proving each DGX platform is production-ready: end-to-end system validation including firmware regression, NVQual certification, DL workload performance, OS/CUDA stack testing, multi-user scenarios, power/thermal validation, and field upgrade reliability.
  • Establish quality gates and zero ship-stopper discipline.
  • Drive platform bring-up for each new DGX system—coordinating first boot across new silicon (CPU, GPU), board design, and firmware teams.
  • Own architectural strategy for next-generation platforms including firmware update mechanisms, system security posture, and AI application readiness.
  • Ensure firmware release flows meet CSP and enterprise deployment requirements.
  • Represent DGX platform readiness in executive reviews and strategic planning with VP/SVP leadership.
  • Engage with industry standards bodies (DMTF Redfish, OCP).
  • Own the complete DGX delivery lifecycle—system architecture, firmware development, integration, full-stack validation, GA release, and customer deployment—for every DGX product.
  • Serve as single point of accountability for DGX platform readiness across NVIDIA—aligning GPU, CPU, networking, security, OS, and AI software teams to deliver on schedule.
  • Own RCCA processes for field issues.
  • Manage external vendor partnerships (AMI for SBIOS, BMC contributors) with clear quality gates and program tracking.
  • Build and lead a world-class engineering organization.
  • Mentor and develop leaders.
  • Foster a culture of technical excellence, intellectual honesty, and customer obsession.

Benefits

  • equity
  • benefits

What This Job Offers

  • Job Type: Full-time
  • Career Level: Manager
  • Number of Employees: 5,001-10,000 employees

© 2026 Teal Labs, Inc