Solutions Architect, AI Infrastructure

NVIDIA
Toronto, ON
Hybrid

About The Position

NVIDIA is seeking an experienced AI Infrastructure Solutions Architect (SA) to bridge the design and deployment of large-scale GPU infrastructure. As part of the NVIDIA SA organization, you will work with customers, partners, and internal teams to analyze, define, and implement large-scale AI/HPC projects, and you will advise business and engineering teams on our product roadmap.

Requirements

  • BS/MS/PhD in Electrical/Computer Engineering, Computer Science, Physics, Mathematics, or another Engineering field, or equivalent experience.
  • 5+ years of Solution Engineering (or similar Sales Engineering, Cloud Engineering, Solution Architecture) including experience working directly with partners and customers.
  • System-level expertise in CPU/GPU server architecture, NICs, Linux, system software, and kernel drivers.
  • Experience with networking switches for Ethernet/InfiniBand, and with data center infrastructure (power/cooling).
  • Knowledge of DevOps/MLOps technologies such as Docker/containers, Kubernetes.
  • Efficient time management and the ability to balance multiple tasks.
  • Excellent presentation, communication and collaboration skills.
  • Self-starter with a passion for growth, continuous learning, and sharing insights.

Nice To Haves

  • Familiarity with NVIDIA GPUs, NVIDIA Networking technologies (e.g. NICs, RoCE, InfiniBand), and systems technology such as NCCL, DCGM, UFM, Mission Control, and Base Command Manager.
  • Experience with bring-up and deployment of large GPU clusters, including deploying and optimizing high-speed networks (InfiniBand/Ethernet), with a clear understanding of how network architecture impacts GPU cluster performance.
  • Systems engineering, coding, and debugging skills, including experience with C/C++ and the Linux kernel and drivers.
  • Experience working with enterprise developers and strong customer-facing skills.

Responsibilities

  • Work with NVIDIA Cloud Partners in Canada on large data center GPU server and networking system deployments.
  • Guide customer discussions on network design and compute/storage, and support bring-up of server/network/cluster deployments.
  • Visit customer data centers during the bring-up phase.
  • Become the primary technical driver for customers during the design, development, construction, integration, and production of GPU Cloud infrastructure and applications throughout the entire customer lifecycle.
  • Work as the customer's trusted advisor, conducting regular technical meetings on product roadmap, cluster issue debugging, feature discussions, and introductions to new technology solutions.
  • Partner with other SAs, Account Managers, Engineering, Product, and business leaders to align on strategies, assess technical needs, and secure business opportunities for NVIDIA.
  • Analyze and debug compute/network configuration and performance issues to deliver performant clusters.
  • Prepare and deliver technical content to customers, including presentations, workshops, reference architectures, tutorials, and publications.

Benefits

  • highly competitive salaries
  • comprehensive benefits package
  • equity