About The Position

We're building a group of innovators to help enterprises deploy and accelerate NVIDIA's three-computer workloads for Physical AI. These include robotics simulation, synthetic data generation, multi-step model training, and inference, all at scale! We are seeking a hands-on Solutions Architect with deep expertise in backend infrastructure, inference, and cloud-native applications to design and scale Kubernetes-native environments for distributed AI/ML workloads. This role offers an outstanding opportunity to build in the rapidly growing field of Robotics AI & Simulation. You'll work closely with our product management, engineering, and business teams to drive adoption of NVIDIA's groundbreaking Physical AI technologies with our key ecosystem partners!

Requirements

  • 5+ years of experience in Solution Architecture or Infrastructure Engineering, advancing AI/ML systems from proof of concept to production in private/public cloud environments.
  • Experience scaling robotics workloads in one or more areas, such as VLM/VLA model training, model inference, robot learning and simulation, or data generation.
  • Strong expertise in networking (DNS, load balancing, TCP/IP, firewalls), storage technologies, workflow orchestration software (e.g., Airflow, Argo), modern DevOps practices (GitOps, IaC, observability), and orchestrating efficient GPU workloads using the NVIDIA GPU Operator and MIG.
  • Excellent communication skills to convey technical concepts to diverse audiences.
  • BS in Computer Science, Computer Engineering, or a related field, or equivalent experience.

Nice To Haves

  • Proficiency with robotics frameworks (e.g., ROS2) and NVIDIA simulation and AI platforms such as Isaac Lab, Isaac Sim, or Cosmos.
  • Experience with AI/ML training workflows and distributed job orchestration using tools like Ray.
  • Deep expertise in transformer networks and experience deploying NVIDIA inference technologies (Dynamo, NIM, Triton, vLLM) using acceleration techniques like quantization.
  • Experience with large-scale data curation techniques and optimization.
  • Broad technical expertise across networking, compute, and storage systems (e.g., S3, NFS, Lustre), with hands-on experience building and debugging APIs (REST, gRPC).
  • Relevant certifications such as NVIDIA Certified AI Engineer, Certified Kubernetes Administrator (CKA), or Cloud Solutions Architect.

Responsibilities

  • Support customers in building scalable and observable GPU-accelerated pipelines for key robotics workloads using Kubernetes, cloud-native technologies, and NVIDIA frameworks (OSMO, Dynamo) across heterogeneous infrastructure.
  • Develop a deep understanding of how robotics workloads scale and help translate that understanding into optimal architectures for partners.
  • Collaborate with DevOps teams to orchestrate data preprocessing, distributed training and inference workloads to optimize job scheduling, costs, storage access, and networking across hybrid and multi-cloud Kubernetes environments (e.g., AWS, Azure, GCP, on-prem).
  • Accelerate inference pipelines using NVIDIA NIM, TensorRT-LLM, vLLM, SGLang, and other engines to enable seamless, disaggregated inference architectures.
  • Collaborate with multi-functional teams (business, engineering, product) and provide technical mentorship to customers implementing Physical AI at scale.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Number of Employees: 5,001-10,000 employees
