Software Engineer, TPU Compiler Development Infrastructure

Google, Sunnyvale, CA
$147,000 - $211,000

About The Position

Our team develops the Accelerated Linear Algebra (XLA) compiler, which enables TPUs, Google's in-house custom-designed processors, to accelerate machine learning and other scientific computing workloads for both internal Google customers and external Cloud customers. The XLA TPU team is reaching a critical threshold of complexity at a time when the demand for rapid iteration has never been higher. This role is designed to manage the infrastructure friction that compiler engineers face daily, effectively multiplying the output of the entire team. In concrete terms, we need to bring the average team presubmit latency down from the current 1.5 hours to 20 minutes and minimize changelist (CL) rollbacks by catching issues early. While this position does not require prior experience with compilers or hardware, or deep ML expertise, it does require someone who genuinely enjoys the craft of building great infrastructure that unblocks developer productivity.

The US base salary range for this full-time position is $147,000-$211,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Requirements

  • Bachelor’s degree or equivalent practical experience.
  • 2 years of experience with coding in C++ and Python, or 1 year of experience with an advanced degree.
  • 2 years of experience working with Google Infrastructure such as Blaze, TAP, or Guitar.

Nice To Haves

  • Master's degree or PhD in Computer Science, or a related technical field.
  • Interest in becoming an expert in infrastructure surrounding low-level ML hardware programming.

Responsibilities

  • Reduce CL time to submit and minimize CL rollbacks for the whole XLA TPU team.
  • Drive infrastructure improvements that remove friction from the daily development of the XLA TPU Compiler team.
  • Develop tools supporting compiler engineers as they work through stages of new TPU introduction (e.g., testing when hardware is not yet available or very limited).
  • Modernize and simplify build/test fixtures (e.g., xla_test) to make them more reliable and easier for the team to use.
  • Design and implement system architectures that cleanly handle an ever-increasing number of TPU generations and compiler features, ensuring the codebase doesn't become a "spaghetti" of special cases.
  • Identify and resolve accelerator utilization bottlenecks, and improve accelerator test coverage without slowing down CL submission.

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Number of Employees

5,001-10,000 employees
