Onboard Software Performance TLM / Staff SWE

Waymo • Posted about 17 hours ago

Full-time • Mid Level

Hybrid • Mountain View, CA

Waymo is an autonomous driving technology company with the mission to be the world's most trusted driver. Since its start as the Google Self-Driving Car Project in 2009, Waymo has focused on building the Waymo Driver (The World's Most Experienced Driver™) to improve access to mobility while saving thousands of lives now lost to traffic crashes. The Waymo Driver powers Waymo’s fully autonomous ride-hail service and can also be applied to a range of vehicle platforms and product use cases. The Waymo Driver has provided over ten million rider-only trips, enabled by its experience autonomously driving over 100 million miles on public roads and tens of billions of miles in simulation across 15+ U.S. states.

The Onboard Software Performance team ensures that systems running on the ADV (Autonomously Driven Vehicle) meet strict performance requirements, such as producing necessary outputs within strict latency targets and keeping each submodule within its allocated compute resources (CPU, GPU, TPU, RAM, etc.). All of this must be done at scale, with performance guarantees of many 9s of reliability, while enabling a high velocity of system evolution.

In this hybrid role, you will report to a Senior Staff Engineer, Technical Lead Manager.
You will:

Develop modular architecture improvements and frameworks for the ADV that maximize performance, compute utilization, and ROI for driving quality
Evolve our compute usage on the car and in simulation to enable continued scaling, so that the system runs fast on the car and efficiently in our data center
Collaborate with onboard teams to identify and remove compute performance bottlenecks across the stack, improving performance and driving quality
Collaborate with hardware teams to co-design hardware and software and to optimize the software for best performance on our hardware platform
Ensure our performance is strong at any driving complexity, including as we scale to even more complex driving environments and encounter rarer events
Ensure state-of-the-art reaction latency for collision avoidance via novel system/architecture designs and extremely fast nominal performance
Develop the necessary high-scale performance evaluation, debugging, and software change management processes
Optimize system resource usage for simulation at scale in cloud data centers: minimizing CPU utilization and latency, minimizing RAM consumption, and intelligently determining which computations should happen on CPU, GPU, and TPU
Define a roadmap/portfolio of projects that optimizes for overall end results across all aspects of the problem space
Set up collaboration structures across team boundaries

BS/MS in Computer Science, EE, Robotics, Physics, Math, or a related field (or equivalent experience)
6 years of software engineering experience on a large-scale, high-complexity system (supported by hundreds of engineers)
2 years of software management experience, with at least 4 years in the infrastructure/systems/performance domain optimizing an end-to-end system for high performance metrics
4 years of experience acting as a technical lead in the performance/software infrastructure domain
4 years of experience in C++

Experience in robotics
Experience in low-level optimization techniques and frameworks (SIMD/CUDA) and in ML performance/frameworks
Experience in large-scale evaluation techniques/data science and in building performance metrics/tooling
Experience in large-scale software re-architecture projects
