Senior Storage Performance Engineer

NVIDIA, Santa Clara, CA
38d

About The Position

NVIDIA is in search of a highly skilled Senior Storage Performance Engineer to join our ambitious team in Santa Clara, CA. This role is essential as we continue to push the boundaries of AI and HPC technologies. You will have the chance to create, implement, and analyze complex benchmarks to optimize performance across NVIDIA's infrastructure stack. Your efforts will directly impact the efficiency and success of our AI inference and training, NVIDIA NIMs, RAG pipelines, HPC codes, and storage platforms, contributing significantly to our innovative journey.

Requirements

  • 12+ years of experience in performance engineering, benchmarking, or HPC/AI systems.
  • Deep expertise in AI/ML and deep learning frameworks (PyTorch, TensorFlow, Triton).
  • Strong background in storage systems and filesystems.
  • Proven experience with MPI, OpenMP, and Slurm in large-scale compute environments.
  • Proficiency in Python, Bash, and automation frameworks for job orchestration and results parsing.
  • Excellent communication skills; ability to context-switch between deep technical work and high-level business impact.
  • BS, MS, or PhD in Computer Science, Electrical Engineering, or a related field, or equivalent experience.

Nice To Haves

  • Experience with RAG pipelines and vector databases (FAISS, Milvus, Qdrant).
  • Familiarity with Kubernetes and CSI-based persistent storage systems.
  • Knowledge of GPU profiling tools (Nsight Systems, PyTorch Profiler).
  • Experience with telemetry/monitoring frameworks (Prometheus, Grafana).
  • Enthusiastic about exploring the boundaries of AI, HPC, and storage capabilities!

Responsibilities

  • Crafting and delivering performance benchmarks across AI, HPC, and enterprise storage platforms.
  • Testing and benchmarking storage appliances (block, file, object) against NVIDIA data center solutions.
  • Operating and adjusting AI inference and training workloads with tools like PyTorch, TensorFlow, and NVIDIA NIMs.
  • Benchmarking and analyzing retrieval-augmented generation (RAG) pipelines, including ingestion, retrieval, and inference performance with vector databases.
  • Profiling and optimizing MPI-based and multi-node distributed applications.
  • Collaborating closely with product managers, system architects, and partners to fine-tune hardware/software stack performance.

Benefits

  • You will be eligible for equity and benefits.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Senior
  • Industry: Computer and Electronic Product Manufacturing
  • Education Level: Ph.D. or professional degree
  • Number of Employees: 5,001-10,000 employees
