Distinguished Engineer - AI Infrastructure

We are seeking a Distinguished Engineer with unrivaled depth in AI/ML inferencing at scale and the distributed systems foundations that power it. You will architect and ship our next-generation AI infrastructure and inferencing platform, serving millions of requests with uncompromising latency, throughput, and reliability requirements. You set the technical North Star, translating high-stakes business problems into elegant, defensible architectures that teams rally behind. You drive consensus through technical authority, shaping roadmaps where your architectural decisions become company strategy. You invent solutions where standard approaches fail and turn constraints into lasting competitive moats.

Core Expertise Required

AI Inferencing & ML Systems: Deep hands-on experience with high-performance inference engines (TensorRT, vLLM, ONNX Runtime, Triton), model optimization (quantization, pruning, distillation), and serving patterns for LLMs and computer vision models at scale. Proven track record of building and architecting RAG pipelines and optimizing retrieval-augmented generation workflows to meet production latency targets.

Distributed Systems at Scale: 15+ years architecting fault-tolerant, low-latency distributed systems. Expert-level understanding of consensus protocols, distributed state management, and data consistency models under partition. Experience with high-performance filesystems and storage engines optimized for AI workloads (checkpoints, model artifacts, training datasets).

AI Infrastructure & Platform Engineering: Built enterprise-grade or SaaS platforms specifically designed for AI/ML workloads: model registries, feature stores, inference gateways, and multi-tenant serving infrastructure. Deep familiarity with GPU/TPU cluster orchestration, memory hierarchy optimization, and heterogeneous compute scheduling.
High-Performance Data Planes: Designed and implemented high-throughput, low-latency networking stacks for critical data-path operations. Expertise in RDMA, DPDK, kernel-bypass techniques, and custom protocols for inter-service and accelerator-to-accelerator communication.

Security & Multi-Tenancy: Hardened multi-tenant ML infrastructure with robust isolation, end-to-end encryption, key management for model weights, and fine-grained RBAC/ABAC for data scientists and production workloads.

Cloud-Native Orchestration: Expert in Kubernetes scheduling extensions (device plugins, custom controllers), service mesh for AI microservices, and API gateway patterns for model serving.

About the Team

ONTAP is NetApp's flagship storage operating system. The ONTAP team drives the product strategy, roadmap, and engineering delivery for ONTAP software and systems. You will be responsible for developing innovative solutions and architectures for ONTAP software and systems spanning filesystems and storage, security, networking, and protocols. The solutions you architect and design will power mission-critical applications, AI infrastructure, and cloud workflows for Fortune 500 companies.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed