About The Position

NVIDIA is at the forefront of defining the next era of computing through AI, with its GPUs acting as the brains for advanced systems like robots and self-driving cars. The company fosters a diverse and supportive environment, attracting top talent globally. The DGX Cloud team is seeking a Lead Software Engineer to develop foundational systems for NVIDIA's high-performance GPU infrastructure. This role involves technical leadership in designing scalable cloud services that integrate with GPU telemetry in datacenters and enable operational automation across global cloud operations.

Requirements

  • At least 12 years of industry experience with a Bachelor’s or Master’s degree (or equivalent experience); PhD preferred
  • Expertise in building scalable REST APIs backed by PostgreSQL-compatible data stores
  • Proficiency in programming languages such as Go or Python
  • Familiarity with modern JavaScript frameworks (e.g., React, Angular, Next.js)
  • Expertise in cloud infrastructure (AWS, GCP, Azure, etc.) and container technologies like Docker and Kubernetes
  • Expertise with high-scale distributed systems, including architectural patterns for APIs and data pipelines
  • Outstanding communication and collaboration skills, with a focus on solving complex operational challenges
  • A passion for delivering scalable and efficient cloud services
  • Familiarity with Linux operating systems

Nice To Haves

  • A track record of leading engineers to successful delivery and operations of high-performance cloud services at Internet scale
  • Experience operating NVIDIA datacenter GPUs
  • Strong debugging and problem-solving skills in distributed environments

Responsibilities

  • Act as technical lead for a team of software engineers designing cloud services backed by databases and data warehouses
  • Design and develop RESTful APIs to ingest telemetry from AI datacenters
  • Build scalable cloud services for high-volume ingestion, processing, and storage of large datasets
  • Build and manage data pipelines for online and offline data storage
  • Collaborate across teams to codify business processes into scalable, self-measuring systems
  • Optimize the reliability and efficiency of cloud services and operations
  • Lead and ship impactful technical projects, ensuring quality and scalability at every stage

Benefits

  • Equity
  • Benefits

What This Job Offers

Job Type

Full-time

Career Level

Senior

Number of Employees

5,001-10,000 employees
