About The Position

For more than 25 years, NVIDIA has changed the landscape of digital imaging, personal gaming, and high-performance computing. Our success depends on reliable, informative telemetry and data systems that provide real-time insight into our sophisticated, distributed infrastructure. As an engineer on our team, you will play a key role in building the next generation of observability for a diverse set of sophisticated workloads. You will transform raw telemetry data into actionable insights. You will architect, develop, and maintain infrastructure that monitors workload health, performance, and usage in critical engineering systems. This allows our global teams to work at peak efficiency. This role offers an outstanding mix of core software engineering, data management, and workload observability.

Requirements

  • BS or higher degree in Computer Science, or equivalent experience.
  • 4+ years of professional experience developing and managing observability infrastructure.
  • Familiarity with EDA (Electronic Design Automation) workflows and tools used in the semiconductor industry.
  • Proficiency in programming and scripting using Python, Perl.
  • Familiarity with databases, containerized applications, and observability stack components.
  • Experience in building data pipelines for a compute cluster using open-source technologies and building custom components as needed.
  • Experience with C++ is a plus.
  • Solid grasp of software engineering principles and methodologies such as OOP, CI/CD.
  • Ability to translate ambiguous problems into concrete solvable pieces.
  • Excellent communication and collaboration skills.
  • Ability to adapt in a fast-paced environment with evolving requirements.

Nice To Haves

  • Background knowledge in accelerated computing (parallel programming) or experience running CPU-vectorized or GPU-based workloads, even if not directly tied to observability.
  • Hands-on experience in developing user interfaces using technologies such as HTML, CSS, JS, ReactJS or VueJS.
  • A passion for improving engineering productivity and efficiency with a data-driven philosophy.

Responsibilities

  • Collaborate closely with internal chip design teams to understand their workflows and determine observability needs to help improve the overall efficiency of our chip development process.
  • Design, build, and maintain robust and scalable platforms and infrastructure for capturing, storing, processing, and visualizing the data collected from chip build workflows.
  • Maintain and update the observability tools and systems to meet the needs of new/evolving chip design workflows.
  • Keep up to date with recent developments in observability tools, frameworks, and strategies, and advocate for their integration within the organization.

Benefits

  • You will be eligible for equity and benefits.
  • NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.