About The Position

Millions of people around the world come to our platform to find creative ideas, dream about new possibilities and plan for memories that will last a lifetime. At Pinterest, we’re on a mission to bring everyone the inspiration to create a life they love, and that starts with the people behind the product. Discover a career where you ignite innovation for millions, transform passion into growth opportunities, celebrate each other’s unique experiences and embrace the flexibility to do your best work. Creating a career you love? It’s Possible.

Staff Software Engineer, Ads ML Inference Infrastructure

The Ads ML Inference Infra team owns the online inference and feature serving systems that power real-time model scoring and delivery for all Ads models at Pinterest. The team is looking for a staff engineer with strong hands-on experience in large-scale ML inference systems, as well as the ability to solve ambiguous technical problems and drive strategic, cross-functional efforts.

Requirements

  • BS (or higher) degree in Computer Science or a related field.
  • ~8+ years of relevant industry experience designing and operating large-scale, production ML or distributed infra systems.
  • Deep knowledge of at least one programming language (Java, C++, Python).
  • Deep experience with distributed systems or recommendation / ads serving infrastructure (e.g., request routing, online storage, caching, feature serving, APIs).
  • Hands-on experience with at least one deep learning framework (PyTorch or TensorFlow) and bringing models from offline experimentation to production.

Nice To Haves

  • Experience with hardware accelerator libraries and model optimization techniques (e.g., CUDA, quantization, distillation, low-precision inference).
  • Experience with inference optimization and serving frameworks such as Triton, vLLM, or Dynamo.
  • Proven track record of leading complex projects, setting technical direction, and collaborating across functions and orgs; experience mentoring and coaching other engineers.

Responsibilities

  • Lead and drive efforts to build next-generation model inference and feature serving systems that power up to 100x larger models and directly uplevel Pinterest’s monetization business.
  • Design and optimize low-latency, high-throughput inference pipelines to meet strict SLOs while improving performance, efficiency, and cost.
  • Partner with Ads ML and product teams to productionize new model architectures (including LLMs and multi-stage ranking models) and scale them reliably to global traffic.
  • Evolve the online feature platform (feature computation, caching, and retrieval) to improve coverage, freshness, and consistency for Ads models.
  • Evaluate and integrate new technologies (e.g., GPU acceleration, model compression, Triton, vLLM, Dynamo) to advance our inference stack.
  • Build strong partnerships with other infra and ML teams to improve end-to-end reliability, observability, and developer velocity for Ads ML.
  • Mentor and coach other engineers, guiding them through technical decisions, system design, and career development.