Senior/Staff AI Engineer

DataDirect Networks (DDN) | San Francisco, CA (Remote)

About The Position

Build the AI infrastructure layer that determines whether modern models actually work in production. Most AI roles sit at the application layer. This one does not. At DDN, we’re hiring an AI Engineer to work on the hard part of AI: the systems, storage, and performance infrastructure behind real-world model serving and inference. This is the role for engineers who care about what happens under load, at scale, and in production — not just in demos. If your background sits at the intersection of AI infrastructure, distributed systems, and performance engineering, this is the kind of role where your depth will matter.

Requirements

  • An engineer who has spent meaningful time building or optimizing production AI systems, not just experimenting with models
  • Someone who understands how inference performance is shaped by the interaction between compute, memory, storage, and serving architecture
  • Deep hands-on experience working close to the systems layer — for example, improving how workloads run across GPU and CPU resources, reducing bottlenecks, or tuning infrastructure for better throughput and latency
  • Evidence of real ownership in areas like model serving, retrieval, caching, storage, or distributed performance, rather than purely application-layer AI work
  • The ability to move comfortably between architecture decisions and hands-on implementation, especially in environments where efficiency and scale matter
  • A background that suggests you can operate in technically demanding environments, whether that comes from AI infrastructure, high-performance systems, storage platforms, or adjacent distributed systems work

Nice To Haves

  • PhD preferred, but far less important than having built serious systems in the real world

Responsibilities

  • Build and optimize LLM serving and inference systems for production environments
  • Improve performance across GPU and CPU pathways
  • Work on KV cache, memory, storage, and throughput bottlenecks
  • Design and scale systems that support RAG and retrieval-heavy AI workloads
  • Contribute to infrastructure where storage architecture and systems efficiency materially affect AI performance
  • Solve engineering problems at the intersection of AI, high-performance systems, and distributed infrastructure

Benefits

  • Salary Range: $150,000 - $250,000