About The Position

We are now looking for a Senior System Software Engineer to work on Dynamo. NVIDIA is hiring software engineers for its GPU-accelerated deep learning software team. Academic and commercial groups around the world are using GPUs to power a revolution in AI, enabling breakthroughs in problems from image classification to speech recognition to natural language processing. We are a fast-paced team building back-end services and software to make the design and deployment of new AI models easier and more accessible to all users. In this role, you will develop open source software to serve inference of trained AI models running on GPUs, balancing a variety of objectives described under Responsibilities below.

Requirements

  • Master's or PhD, or equivalent experience
  • 3+ years of experience in Computer Science, Computer Engineering, or a related field
  • Ability to work in a fast-paced, agile team environment
  • Excellent Rust, Python, and C++ programming and software design skills, including debugging, performance analysis, and test design
  • Experience with large-scale distributed systems and ML systems

Nice To Haves

  • Prior work experience improving performance of AI inference systems.
  • Background in deep learning algorithms and frameworks, especially experience with Large Language Models and frameworks such as PyTorch, TensorRT, and ONNX Runtime.
  • Experience building and deploying cloud services using HTTP REST, gRPC, protobuf, JSON and related technologies.
  • Familiarity with the latest AI research and working knowledge of how these systems are efficiently implemented.

Responsibilities

  • Develop open source software to serve inference of trained AI models running on GPUs
  • Build robust, scalable, high-performance software components to support our distributed inference workloads
  • Work with team leads to prioritize features and capabilities
  • Load-balance asynchronous requests across available resources
  • Optimize prediction throughput under latency constraints
  • Integrate the latest open source technology

Benefits

  • Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
  • The base salary range is 152,000 USD - 218,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.
  • You will also be eligible for equity and benefits.