Network Architect

SambaNova Systems • San Jose, CA

About The Position

The era of pervasive AI has arrived. In this era, organizations will use generative AI to unlock hidden value in their data, accelerate processes, reduce costs, and drive efficiency and innovation, fundamentally transforming their businesses and operations at scale.

About SambaNova Systems: Join the company that's building the future of AI computing. At SambaNova, we are disrupting the AI and high-performance computing space with our integrated hardware and software platform. Our DataScale systems and SambaFlow software are pushing the boundaries of what's possible with generative AI and large language models. We are a team of passionate innovators tackling some of the world's most challenging computational problems.

The Opportunity: We are seeking a visionary Network Architect to define and drive the future of our hyperscale AI compute platform. In this critical role, you will architect the foundational networks powering next-generation AI workloads, from RDU-accelerated servers to global-scale AI clusters. You will sit at the intersection of hardware engineering, product strategy, and large-scale deployment, translating complex technical and customer requirements into scalable, cost-optimized, and high-performance infrastructure solutions.

Requirements

12+ years of experience designing, architecting, or productizing compute infrastructure for hyperscale, AI, HPC, cloud, or large-scale data center environments
Deep experience with AI compute platforms, including GPU/accelerator systems, high-density server architectures, rack-scale compute, and cluster-level design considerations
Strong understanding of server and rack-level architecture, including power, thermal, mechanical, serviceability, firmware, and platform integration requirements
Experience translating customer, workload, and deployment requirements into compute product requirements, platform roadmaps, technical specifications, and architecture tradeoffs
Familiarity with modern AI infrastructure components such as GPU servers, accelerator trays, NVLink/NVSwitch-class fabrics, PCIe/CXL, high-speed NICs, DPUs/IPUs, and storage-attached compute architectures
Deep working knowledge of AI cluster networking and fabric dependencies, including InfiniBand, RoCEv2, Ethernet-based AI fabrics, 400G/800G interconnects, and GPU-to-GPU east-west traffic patterns
Experience with lossless or near-lossless Ethernet designs, including QoS, priority mapping, ECN, PFC, congestion management, buffer tuning, telemetry, and failure-domain isolation
Strong understanding of spine-leaf network architectures and associated technologies such as VXLAN, EVPN, BGP, MLAG, ECMP, underlay/overlay design, multi-tenant segmentation, and network automation
Ability to design and evaluate solutions across multiple networking platforms and operating models, with a vendor-agnostic approach to architecture, platform selection, interoperability, and lifecycle strategy
Experience comparing and integrating technologies across leading switch, NIC, accelerator, optical, and fabric ecosystems, while avoiding unnecessary vendor lock-in
Familiarity with high-speed optics and cabling implications for AI clusters, including 400G/800G DR4, FR4, LR4, SR, DAC/AOC, fiber topology, link budgets, and transceiver interoperability
Ability to evaluate how network choices impact compute product requirements, including NIC selection, DPU/IPU integration, PCIe lane allocation, rack power/thermal design, cabling density, latency, throughput, resiliency, and serviceability
Ability to partner closely with hardware engineering, supply chain, manufacturing, operations, networking, facilities, vendors/OEMs/ODMs, and customer-facing teams to bring compute platforms from concept through deployment
Background supporting large-scale infrastructure operations, including fleet deployment, platform lifecycle management, reliability, observability, telemetry, automation, and field issue resolution
Strong product mindset, with the ability to balance performance, cost, manufacturability, deployment velocity, serviceability, supply availability, power efficiency, ecosystem flexibility, and long-term scalability

Nice To Haves

Experience working at the intersection of AI compute architecture, product strategy, and hyperscale deployment, with enough network and fabric depth to understand how InfiniBand, RoCE, congestion control, platform interoperability, and spine-leaf design choices shape the compute products required for next-generation AI infrastructure.

Responsibilities

Define and drive the future of our hyperscale AI compute platform.
Architect the foundational networks powering next-generation AI workloads, from RDU-accelerated servers to global-scale AI clusters.
Translate complex technical and customer requirements into scalable, cost-optimized, and high-performance infrastructure solutions.
Design and evaluate solutions across multiple networking platforms and operating models, with a vendor-agnostic approach to architecture, platform selection, interoperability, and lifecycle strategy.
Evaluate how network choices impact compute product requirements, including NIC selection, DPU/IPU integration, PCIe lane allocation, rack power/thermal design, cabling density, latency, throughput, resiliency, and serviceability.
Partner closely with hardware engineering, supply chain, manufacturing, operations, networking, facilities, vendors/OEMs/ODMs, and customer-facing teams to bring compute platforms from concept through deployment.
Support large-scale infrastructure operations, including fleet deployment, platform lifecycle management, reliability, observability, telemetry, automation, and field issue resolution.

Benefits

Equity
95% premium coverage for employee medical insurance
77% premium coverage for dependents
Health Savings Account (HSA) with employer contribution
Dental insurance
Vision insurance
Short- and long-term disability insurance
Basic Life insurance
Voluntary Life insurance
AD&D insurance
Flexible Spending Account (FSA) options (Health Care, Limited Purpose, and Dependent Care)
Full subscription to Headspace
Gympass+ membership with access to physical gyms
One Medical membership
Counseling services with an Employee Assistance Program