Forward Deployed Architect

NVIDIA • Santa Clara, CA

4d • $224,000 - $431,250 • Hybrid

About The Position

NVIDIA is looking for a Forward Deployed Architect to provide technical leadership and strategic guidance across AI Accelerator engagements with AI Native organizations, NeoCloud Providers, and ISVs. You'll advise on architecture and integration, define what good looks like, and bring learnings back to inform the DSX product roadmap. This role is engaged when standard product capabilities are not enough and the work needs to be specialized. Our team works alongside customers and partners on infrastructure problems no one has solved yet, helping teams adopt NVIDIA technology the right way and shaping how new AI workloads get deployed.

Requirements

Bachelor's degree or equivalent experience.
12+ years in technical roles such as solutions architecture, ML engineering, technical product management, or technical consulting across multiple customers or projects. Alternatively, 5+ years of specialist-level experience working at the frontier of AI infrastructure.
Strong technical leadership with the ability to guide teams and influence technical decisions without direct authority.
Systems thinking with the ability to understand customer outcomes and translate them into clear technical requirements and architectures.
Willingness to prototype, implement, validate, and troubleshoot hands-on when needed to solve critical problems or prove out approaches.
A solid technical foundation in the technologies AI infrastructure is built on, especially Linux systems administration.
A self-directed learner who can ramp up on brand-new technologies and unfamiliar technical domains independently.
Strong communication skills with the ability to engage technical teams, executives, and multi-functional collaborators.

Nice To Haves

Solutions architecture or technical consulting background spanning multiple simultaneous customer engagements, with experience bringing novel AI hardware or frameworks to production with frontier AI Native organizations, hyperscalers, NeoClouds, or ISVs.
A foundational cloud or distributed systems background built at hyperscaler scale.
A public technical voice: blog posts, talks, open-source contributions, or reference work that shows depth and opinion.
Hands-on technical expertise in one or more of the following: NVIDIA stack (CUDA, NeMo, Triton, TensorRT, NIM, DGX Cloud, and the broader DSX software portfolio); inference systems (large-scale inference with frameworks like vLLM and SGLang, prefill-decode disaggregation, performance optimization across hardware); training systems (distributed training, model and pipeline optimization, open-source generative AI frameworks); infrastructure (SLURM, Kubernetes, GPU scheduling, distributed computing frameworks, rack-scale systems, multiple CSP or NCP cloud environments); observability and automation (CI/CD, infrastructure as code, GPU performance monitoring).

Responsibilities

Provide architectural direction across strategic engagements where standard capabilities are not enough and advanced implementation, optimization, or integration customization is needed.
Help customers integrate the right components to deliver on their outcomes. Where DSX software fits, advise on adopting it the right way. Where it doesn't, help them succeed with the right alternative and bring the gap back to product and engineering.
Dive into complex technical challenges hands-on when needed to solve critical problems, validate architectures, or prove out solutions.
Lead technically demanding programs end to end, including third-party performance benchmarking across hardware and workloads.
Identify common challenges and solution patterns across engagements. Share findings with internal teams and the broader AI community.
Develop standardized approaches, reference architectures, and structured guidance rooted in patterns from successful engagements.
Partner with product, engineering, and other customer-facing NVIDIA teams so what we learn in the field informs internal strategy and capabilities.
Design technical strategies for advanced AI workloads (distributed training, large-scale inference, model and pipeline optimization, MLOps) that apply across multiple customers and partners.
Help develop new infrastructure patterns and playbooks for the latest NVIDIA hardware as it lands with customers and partners.