Principal Architect, AI Networking

NVIDIA, Austin, CA
12d, Hybrid

About The Position

NVIDIA has been redefining computer graphics, PC gaming, and accelerated computing for more than 25 years. It’s a unique legacy of innovation that’s fueled by great technology and amazing people. We're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world!

As a Software Architect for AI Networking, you will be part of a groundbreaking team driving technological innovation. This is an outstanding opportunity to craft the future of AI networking by using cutting-edge technologies to address sophisticated challenges. We work in a highly dynamic environment where your contributions will significantly influence AI infrastructure.

Requirements

  • Master's or Ph.D. in Computer Science, Electrical or Computer Engineering, or a related field from a top-tier university (or equivalent experience).
  • 12+ years of relevant academic or proven experience in the field.
  • Comprehensive understanding of AI workloads (primarily inference, but also training) and their impact on network infrastructure.
  • Strong proficiency in Machine Learning/Deep Learning fundamentals, inference runtimes, and Deep Learning frameworks.
  • Skilled in C or C++ for systems software development; familiarity with Rust is helpful.
  • Curiosity about building leading-edge technology.
  • Ability to work and communicate effectively across diverse teams with varying expertise and time zones.

Nice To Haves

  • Proven research track record.
  • Experience with LLM inference and its AI networking and storage needs.
  • Background in storage and storage optimization: file systems, object store, caches, coherency.
  • Stellar communication skills.

Responsibilities

  • Working on accelerating NVIDIA Dynamo (KV cache management and large-scale inference).
  • Developing and researching groundbreaking networking technologies to advance and scale AI networks.
  • Co-designing software and hardware networking solutions across various networking-related domains, from network transports to AI frameworks.
  • Working closely with NVIDIA's hardware architecture, software architecture, and research teams to build innovative networking hardware and software solutions.
  • Leading the development of prototypes that optimize AI training and inference infrastructure.