About The Position

We are looking for a senior systems software engineer to improve the operation and user experience of distributed system infrastructure using AI. We are passionate about the opportunity to shape the future of the NVIDIA software platform! We want to develop next-generation AI-powered infrastructure that scales to run millions of requests and jobs on thousands of servers efficiently, reliably, and securely.

Requirements

  • Bachelor’s degree or equivalent experience.
  • More than 8 years of software engineering experience in one or more programming languages, including strong knowledge of data structures and algorithms.
  • 2 years of experience designing, testing, maintaining and/or launching distributed systems.

Nice To Haves

  • Experience with Agentic AI frameworks and patterns.
  • Background with container technologies such as Docker and Kubernetes.
  • Experience with DevOps tools such as Ansible, Terraform, and Jenkins.
  • Knowledge of the Java programming language and web frameworks such as Spring Boot.

Responsibilities

  • Design, build, test, deploy, and maintain software infrastructure APIs and services.
  • Collaborate effectively with peers and partners through brainstorming and review sessions to produce high-quality design, code, and documentation.
  • Review, recommend, and model best practices for distributed systems, including reliability, performance, monitoring, security, compliance, interoperability, usability, correctness, consistency, and simplicity.
  • Support other NVIDIA teams in using this foundational infrastructure.
  • Debug production issues across services and multiple levels of the stack.

Benefits

  • You will be eligible for equity and benefits.