Cloud Platform Software Engineer

NVIDIA, Santa Clara, CA

About The Position

We are the Platform API team within NVIDIA's DGX Cloud organization. We are a collaborative group of cloud platform engineers, architects, and SREs who are passionate about building and nurturing the declarative, Kubernetes-native control plane that powers GPU-accelerated infrastructure across multiple cloud providers. Together, we're empowering the world's leading AI teams to train and deploy at datacenter scale. We design and extend Kube-like APIs, and we craft Go-based reconciliation controllers that thoughtfully turn high-level intent into production-ready AI infrastructure. We take pride in owning our code end-to-end, and we care deeply about the full lifecycle of multi-cloud GPU clusters, from customer onboarding and provisioning through upgrades and decommissioning. We partner closely with our runtime, cloud architecture, observability, and storage teams to solve sophisticated distributed systems challenges together. As a team, we're strengthening NVIDIA's approach to Cloud Native development.

Requirements

  • BS in Computer Science, Information Systems, Computer Engineering, or equivalent experience
  • 5+ years of proven experience in large-scale software development
  • Experience building and shipping services on Kubernetes
  • Background using and contributing to open-source projects
  • Experience collaborating with teams to write software that supports cloud services at scale
  • Programming experience in a relevant language, e.g. Golang, Python
  • Ability to communicate design and quality strategy in written, visual, and oral formats
  • Experience with a wide range of modern infrastructure tools and technologies

Nice To Haves

  • Experience with Kubernetes Cluster API, Terraform, Tinkerbell, and other infrastructure tooling
  • Practical experience with Azure, GCP, or AWS
  • Ability to refactor software to run on platforms such as Kubernetes
  • Ability to discuss and work with CSI, CNI, and CRI and/or familiarity with the CNCF and the tooling across the ecosystem
  • Upstream contributions to open-source projects

Responsibilities

  • Develop software systems to support large-scale deployments of cloud infrastructure
  • Design and develop APIs to support Infrastructure as Code (IaC) automation and deployment workflows
  • Contribute to multiple source code projects to deliver software services that meet NVIDIA requirements
  • Collaborate with engineering managers, architects, designers, and frontend engineers to deliver high-quality software
  • Automate the validation of software solutions with unit and integration tests
  • Participate in the ownership and health of CI/CD pipelines from dev to production environments
  • Collaborate with other specialists for feedback on proposed designs and product direction
  • Openly share successes and failures in a no-blame environment

Benefits

  • equity
  • benefits

What This Job Offers

Job Type: Full-time
Career Level: Senior
Education Level: Associate degree
Number of Employees: 5,001-10,000 employees
