Software Engineer III, AI/ML, GPU Inference, Optimization

Google | Sunnyvale, CA
$141,000 - $202,000

About The Position

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design, and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google's needs, with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities, and be enthusiastic about taking on new problems across the full stack as we continue to push technology forward.

The Cloud ML Compute Services (CMCS) team sits within the Cloud organization and is chartered to build cross-Google alignment toward a unified central infrastructure that hosts all of Google's ML needs, including internal and external use cases. The CMCS Inference team is part of the CMCS team and focuses on inference workloads and the serving infrastructure.

In this role, you will optimize machine learning models for large-scale inference workloads, applying a range of large-scale Machine Learning (ML) optimization techniques to improve latency and throughput. You will bring experience with accelerators (TPUs or GPUs) or HPC.

Requirements

  • Experience with machine learning model optimization
  • Familiarity with large-scale inference workloads
  • Experience with accelerators (TPUs or GPUs) or HPC
  • Strong background in software engineering principles
  • Ability to work on critical projects and switch teams as needed

Nice To Haves

  • Experience in information retrieval
  • Knowledge of distributed computing
  • Familiarity with large-scale system design
  • Understanding of networking and data storage
  • Experience in artificial intelligence and natural language processing

Responsibilities

  • Optimize machine learning models for large-scale inference workloads
  • Work on inference workloads and serving infrastructure
  • Apply large-scale Machine Learning (ML) optimization techniques to improve latency and throughput
  • Collaborate with cross-functional teams to address critical project needs
  • Display leadership qualities and take on new problems across the full stack

Benefits

  • Bonus
  • Equity
  • Health benefits
  • Retirement plans
  • Paid time off

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Industry: Web Search Portals, Libraries, Archives, and Other Information Services
