Senior System Software Engineer – Dynamo Tools

NVIDIA • Santa Clara, CA

About The Position

We are seeking a Senior System Software Engineer to own and advance AI-Perf, NVIDIA’s flagship framework for benchmarking, experimentation, and analysis of LLMs, Generative AI, and deep learning inference workloads. In this role, you’ll combine systems research, distributed systems engineering, and applied AI, enabling reproducible performance evaluation, influencing internal platforms, and providing tooling that empowers researchers and engineers globally.

Why you’ll love this role:
Impact at scale: Your work will define how AI performance is measured, optimized, and understood by engineers and researchers worldwide.
Innovation and ownership: Lead a critical tool used by internal teams and external partners, shaping the AI benchmarking ecosystem.
Collaborative research environment: Work closely with world-class AI researchers, engineers, and platform architects on cutting-edge inference challenges.
Visibility and growth: Contribute to tooling that powers publications, benchmarks, and industry-leading AI performance insights.

With highly competitive salaries and a comprehensive benefits package, NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and hardworking people in the world working for us and, due to outstanding growth, our special engineering teams are growing fast. If you're a creative and autonomous engineer with a genuine passion for technology, we want to hear from you!

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 184,000 USD - 287,500 USD for Level 4, and 224,000 USD - 356,500 USD for Level 5. You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until January 16, 2026. This posting is for an existing vacancy. NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer.
As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

NVIDIA is the world leader in accelerated computing. NVIDIA pioneered accelerated computing to tackle challenges no one else can solve. Our work in AI and digital twins is transforming the world's largest industries and profoundly impacting society. Learn more about NVIDIA.

Requirements

Bachelor’s, Master’s, or PhD in Computer Science, Computer Engineering, or related field—or equivalent experience.
8+ years of experience in systems software, distributed performance engineering, or AI infrastructure research.
Expert-level Python skills, including profiling, optimization, automation, and debugging of complex systems.
Deep knowledge of distributed systems concepts, including scalability, concurrency, fault tolerance, and performance trade-offs.

Nice To Haves

Experience designing or maintaining performance benchmarking frameworks or tooling for AI/ML systems.
Hands-on experience with LLMs and deep learning frameworks such as PyTorch, TensorFlow, TensorRT, or ONNX Runtime.
Contributions to open-source or research projects in AI performance, infrastructure, or distributed systems.
Experience running large-scale inference experiments across cloud and on-prem environments (AWS, Azure, GCP, bare metal).

Responsibilities

Lead the design, development, and roadmap of AI-Perf, defining benchmarking methodologies, performance metrics, and reproducible experimental workflows.
Build scalable and high-performance features to measure latency, throughput, and efficiency across AI models and distributed systems.
Partner with AI researchers, platform teams, and engineers to translate experimental challenges into robust, user-friendly performance tooling.
Integrate AI-Perf with the Dynamo Inference Stack, other NVIDIA inference stacks, and open-source inference frameworks, delivering end-to-end performance insights for researchers and production users.