About The Position

NVIDIA is seeking a Solution Architect to join their team, focusing on revolutionizing AI with data center scale solutions. The role involves designing, building, and maintaining large-scale HPC and AI infrastructure. Solution Architects at NVIDIA help make AI Factories a reality by working closely with customers and partners to address unsolved industry problems, deploying and operationalizing AI solutions at scale. Day-to-day work includes enabling partners to adopt end-to-end AI solutions using NVIDIA's compute, networking, and software stacks, with a particular emphasis on CPU-based solutions within the NVIDIA AI Factory. This multi-faceted role requires comfort with hardware, software, AI workflows, and operationalization of large-scale compute resources. The goal is to help customers overcome barriers to adopting NVIDIA's best known methods, with the Solution Architect acting as a technical leader for CPU components. The team also focuses on knowledge sharing through demos, proof-of-concepts, papers, and developer blogs, collaborating with executives and engineering to solve complex problems and bring NVIDIA's technologies to life. NVIDIA is a leader in accelerated computing, transforming industries with AI and digital twins.

Requirements

  • Experience defining, deploying, and testing large-scale reference architectures for High Performance Computing and AI
  • A track record of defining and using MLOps and AI workflow tools and processes
  • 6 or more years of hands-on expertise with modern data center architectures and interaction between CPUs, GPUs, and networking
  • Strong foundational expertise and a BS, MS, or equivalent experience in Engineering
  • Strong analytical and problem-solving skills, along with an ability to articulate what you know to others
  • Ability to multitask efficiently in a multifaceted environment
  • Experience organizing, presenting, and discussing technical materials with groups that span a range of technical ability
  • Flexibility to adapt in fluid situations, especially with partners or customers
  • Comfortable with occasional travel to customer sites

Nice To Haves

  • Hands-on experience with Arm-based server processors and the Arm software ecosystem
  • Proficiency with tooling, automation, and performance testing for large-scale clusters, preferably using AI tools
  • Deep understanding of Agentic AI and inference workflows
  • Experience building, using, and explaining reinforcement learning
  • Willingness and ability to learn quickly as we address sophisticated problems, and an understanding of how all elements of the AI Factory interact with each other

Responsibilities

  • Design, build, and maintain large-scale HPC and AI infrastructure
  • Work closely with customers and partners to address unsolved problems in the industry
  • Help to deploy and operationalize AI solutions at scale
  • Help partners be successful in their adoption of end-to-end AI solutions using NVIDIA's compute, networking, and software stacks
  • Possess a deep technical understanding of NVIDIA Reference Architectures and use that understanding to enable customers adopting CPU-based solutions as part of the overall NVIDIA AI Factory
  • Work on hardware and software elements, the larger AI workflow, and operationalization of large-scale compute resources
  • Help customers overcome barriers to adopting NVIDIA's best known methods
  • Play an instrumental role as the technical leader for the CPU components within the NVIDIA AI Factory
  • Share knowledge with colleagues, delivering demos, assisting with proof-of-concepts, or writing papers and developer blogs
  • Collaborate with executives and engineering to tackle sophisticated problems and help bring NVIDIA's premier technologies to life
  • Solve problems that nobody else has solved yet

Benefits

  • Eligible for equity and benefits