About The Position

Today, we’re tapping into the unlimited potential of AI to define the next era of computing: an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what’s never been done before takes vision, innovation, and the world’s best talent. As an NVIDIAN, you’ll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

NVIDIA is seeking an extraordinary innovator in networking and system architecture to join our Research team. In this role, you will focus on designing networks optimized for AI systems, as well as leveraging AI and machine learning to advance network architecture and enable intelligent, real-time decision-making for tasks such as control, routing, congestion management, and scheduling. We are seeking a candidate who combines research excellence in building systems with a deep understanding of, and broad perspective across, computer architecture and communication systems for distributed computation.

NVIDIA is a technology leader in systems for Artificial Intelligence, and this research team invents new networking technologies to advance the performance, scalability, and efficiency of next-generation computing platforms. This position offers you the opportunity to have a real impact while working with some of the most creative and forward-thinking people in the world at this dynamic, technology-focused company.

Requirements

  • Pursuing or recently completed a PhD in a relevant discipline (CS, CE, EE, Physics, Math), or equivalent experience.
  • 2+ years of relevant industrial and academic experience preferred.
  • Relevant experience includes systems and network design for AI or HPC infrastructure and/or applying AI/ML to systems or networking problems (e.g., optimization, reinforcement learning, graph learning, telemetry-driven control).
  • Background and publication record in systems, networking, computer architecture, and/or ML for networking/systems.
  • Publication at venues such as ISCA, HPCA, MICRO, SIGCOMM, NSDI (and related venues) is a plus.
  • Signals of fit: evidence via publications plus artifacts (open-source, prototypes, production deployments) demonstrating impactful contributions to systems or network design for AI infrastructure, including where ML/RL informs design or drives online decisions.
  • Experience with AI/ML methods and frameworks (e.g., PyTorch, TensorFlow, JAX) for systems and networks supporting AI workloads is valuable, including applying these methods to system or network building, simulation, optimization, and control/decision loops.
  • Strong programming and prototyping ability, with experience building research artifacts, simulators, or system prototypes; C++ and Python preferred, and experience with hardware description languages or HLS is desirable.

Responsibilities

  • Develop innovative network architectures, algorithms, and hardware/software co-design approaches for high-performance interconnects and large-scale distributed AI systems, including AI-assisted methods where appropriate, to enable efficient, scalable, and robust communication and extend the state of the art in networking, distributed computing, and system architecture.
  • Create and evaluate mechanisms for network and system decision-making in large-scale GPU/accelerator clusters, including AI/ML-based approaches for routing, traffic engineering, congestion control, scheduling, topology design, and telemetry-driven control loops.
  • Invent new techniques, technologies, methodologies, processes, and devices to enable new products or types of products. Deliverables include prototypes, patents, publications, and product impact.
  • Prototype new ideas in simulation or through analytical modeling.
  • Produce a technology vision and the basis for products 5-10 years out. The focus should not be on products currently shipping or in development, except for how they can be extended and improved.
  • Participate in the broader research community. Examples include serving as a reviewer or on program committees, publishing papers, and speaking at conferences.
  • Collaborate with external researchers, primarily in academia, to encourage mutually beneficial work.

Benefits

  • You will be eligible for equity and benefits.

What This Job Offers

  • Job Type: Full-time
  • Career Level: Entry Level
  • Education Level: Ph.D. or professional degree
  • Number of Employees: 5,001-10,000 employees
