Senior Software Engineer, AI Infrastructure

NVIDIA, Santa Clara, CA
$224,000 - $356,500

About The Position

NVIDIA DGX Cloud is a managed, multi-cloud AI supercomputing service that gives enterprises instant access to NVIDIA's high-performance AI infrastructure and software, including dedicated DGX AI supercomputing clusters, optimized software stacks, and expertise. The platform enables users to rapidly build, train, and deploy large-scale AI models across leading cloud providers like Oracle, Azure, and Google Cloud, eliminating the complexity of managing their own infrastructure. Key features include pre-trained and fine-tunable models, serverless GPU inference, and a unified interface for multi-cloud management.

NVIDIA is looking for a passionate engineer to join our DGX Engineering Team as a Senior Software Engineer. In this role, you will play a significant part in helping to craft and guide the future of AI & GPUs in the Cloud. Are you passionate about cloud software development, and do you strive for quality? Do you pride yourself on building cloud-scale software systems? If so, join our team at NVIDIA, where we are dedicated to delivering GPU-powered services around the world!

Requirements

  • Expertise in Kubernetes (K8s) & KubeVirt.
  • Expertise in virtualization technologies such as Firecracker, KVM, OpenStack, Nutanix AHV & Red Hat OpenShift.
  • Extensive experience with Golang and building RESTful web services.
  • Demonstrated understanding of cloud design in the areas of virtualization and global infrastructure, distributed systems, and security.
  • Experience with Docker and containers.
  • Background with Infrastructure as Code.
  • Experience with AWS (Fargate, EC2, IAM, ECR, EKS, Route 53, etc.).
  • Experience with Continuous Integration and Continuous Delivery.
  • BS or MS in Computer Science or equivalent experience, with 12+ years of hands-on software engineering.
  • Excellent interpersonal and written communication skills required.

Nice To Haves

  • Experience with Postgres.
  • Exposure to Helm Charts & Terraform.
  • A track record of solving complex problems with elegant solutions.
  • Prior experience with Rust & Python, as well as demonstrated delivery of complex projects in previous roles.
  • Experience with load testing frameworks and with secrets management.

Responsibilities

  • Building RESTful cloud services and virtualization frameworks that form the NVIDIA DGX Cloud Reference Architecture.
  • Designing, building, and implementing scalable cloud-based systems for PaaS/IaaS.
  • Working closely with other teams on new products or features/improvements of existing products.
  • Driving performance tuning and automation.
  • Supporting, maintaining, and documenting software functionality.

Benefits

  • Competitive salaries
  • Generous benefits package
  • Equity opportunities

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Industry: Computer and Electronic Product Manufacturing
  • Education Level: Bachelor's degree

© 2024 Teal Labs, Inc