Software Engineer, Generative AI and LLMs, YouTube

Google, San Bruno, CA
$197,000 - $291,000

About The Position

Google's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. Our products need to handle information at massive scale, and extend well beyond web search. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google’s needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. We need our engineers to be versatile, display leadership qualities and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.

The YouTube GDA (Generative Data Analytics) team in YouTube Data Organization is democratizing the data ecosystem by leveraging Large Language Models (LLMs) to transform complex data into actionable intelligence. As a Software Engineer, you will architect the semantic layer that powers GDA products like YouTube SQL Assistant, YouTube Debugger, etc. You will define how our models understand YouTube’s massive data landscape, owning the end-to-end strategy for context retrieval.

At YouTube, we believe that everyone deserves to have a voice, and that the world is a better place when we listen, share, and build community through our stories. We work together to give everyone the power to share their story, explore what they love, and connect with one another in the process. Working at the intersection of cutting-edge technology and boundless creativity, we move at the speed of culture with a shared goal to show people the world. We explore new ideas, solve real problems, and have fun, and we do it all together.

The US base salary range for this full-time position is $197,000-$291,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process. Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

Requirements

  • Bachelor's degree or equivalent practical experience.
  • 8 years of experience in software development.
  • 5 years of experience testing and launching software products.
  • 3 years of experience with software design and architecture.
  • 5 years of experience with machine learning algorithms and tools (e.g. TensorFlow), artificial intelligence, deep learning, or natural language processing.

Nice To Haves

  • 8 years of experience with data structures and algorithms.
  • Experience designing and implementing complex context retrieval systems (context scaffolding) for large language models in production.
  • Experience leading technical strategy for large-scale projects and mentoring senior engineers.
  • Proficiency in SQL.
  • Understanding of vector space models, semantic search, and experience evaluating embedding techniques (e.g., dense versus sparse retrieval, re-ranking strategies).
  • Ability to independently query massive datasets to diagnose model behavior, identify edge cases, and drive data-informed architectural decisions.

Responsibilities

  • Design and implement dynamic context construction strategies for RAG (Retrieval Augmented Generation) systems.
  • Solve challenges related to limited context versus massive schema documentation, optimizing for precision, recall, and cost.
  • Lead the evaluation and fine-tuning of embedding strategies (dense, sparse, and hybrid) to capture domain-specific YouTube terminology.
  • Decide how we represent data tables, column definitions, and debug logs in vector space.
  • Build data pipelines that keep our semantic index fresh in real time as YouTube’s data schemas evolve.
  • Mentor Senior Engineers, drive technical roadmap planning, and collaborate with partners in Google DeepMind and YouTube infrastructure to adopt research.