GM • Posted 2 days ago
Full-time • Mid Level
Sunnyvale, CA
5,001-10,000 employees

The Role

You will be part of a core team that ensures safe, reliable, and scalable releases of the Autonomous Vehicle (AV) software stack through automation, data-driven reliability insights, and systematic validation. The mission is to accelerate the velocity and stability of AV releases by unifying software engineering, reliability analysis, and release automation under one cohesive framework. In this position, you will collaborate closely with Release Engineers, Systems Engineers, DevOps, and AI/ML teams to design and implement automated release validation pipelines, build metrics for release health and stability, and drive root-cause investigations of software and system issues that impact production readiness. If you are passionate about improving the safety, reliability, and velocity of ML-driven AV software releases through intelligent automation and systems thinking, we want to talk to you.

Responsibilities

  • Design and implement automated release pipelines that integrate simulation, on-road, and CI/CD validation signals to assess software readiness.
  • Establish release reliability metrics and dashboards to quantify build quality, regression trends, and deployment confidence.
  • Collaborate with AI/ML, Simulation, and Systems Engineering teams to ensure robust and reproducible evaluation of release candidates.
  • Develop automated triage and failure analysis systems to identify and categorize root causes of reliability or stability regressions.
  • Integrate data pipelines for continuous monitoring of release health, including automated collection of test, simulation, and telemetry data.
  • Drive systematic improvements in release readiness criteria, defining measurable gates and pass/fail logic tied to product safety and reliability standards.
  • Develop frameworks for continuous release validation, ensuring each ML or software iteration is tracked, reproducible, and explainable.
  • Communicate insights and reliability findings to developers, QA, and leadership to influence roadmap prioritization and technical debt mitigation.
  • Map reliability and automation processes to the broader safety case framework, ensuring compliance with relevant standards and internal governance.

Qualifications

  • Strong proficiency in Python and SQL
  • Proven experience in CI/CD systems (e.g., GitHub Actions, Jenkins, GitLab, or equivalent)
  • Prior experience implementing ELT/ETL pipelines for quality monitoring, reliability, or release metrics
  • Solid understanding of system reliability concepts, including regression tracking, flakiness detection, and automated triage
  • Strong analytical, debugging, and problem-solving skills across large-scale software systems
  • Experience integrating simulation or hardware-in-the-loop testing into automated pipelines
  • Track record of cross-functional collaboration across engineering, QA, and operations teams
  • Ability to learn quickly and operate effectively in a dynamic, high-stakes environment
  • Excellent communication skills for presenting data-driven insights to engineering and leadership stakeholders
  • Bachelor’s, Master’s, or PhD in Computer Science, Electrical Engineering, Robotics, or a related field—or equivalent experience

Preferred Qualifications

  • Experience with release governance frameworks for ML or AV systems
  • Familiarity with reliability engineering methodologies (MTBF, FMEA, reliability growth analysis)
  • Knowledge of AV/ADAS software architectures and simulation validation loops
  • Experience building metrics pipelines in cloud environments (AWS, GCP, or Azure)
  • Familiarity with data visualization and observability tools (Grafana, Superset, Power BI, etc.)
  • Experience using Jira, GitHub Projects, or equivalent tools for release tracking and reliability triage