Senior Machine Learning Engineer

Adobe, San Jose, CA
Posted 1d ago
$172,500 - $306,625

About The Position

Our focus is developing AI technologies for text, images, and videos to boost creativity. We're seeking an outstanding ML infrastructure engineer with deep expertise in building large-scale foundation model infrastructure to support all the generative AI efforts in Firefly! This is a chance to make a huge impact in a fast-paced, startup-like environment at a great company. Join us! The position involves building infrastructure that touches various components of our foundation model stack, including large-scale data processing, scalable and reliable PyTorch training infrastructure, GPU optimizations with custom CUDA kernels on the latest NVIDIA GPUs, and more!

Requirements

  • Graduate, PhD, or postgraduate degree in Computer Science, Computer Engineering, or a related field—or equivalent experience.
  • 5+ years of ML engineering experience, specializing in generative AI such as LLMs.
  • Strong Python and deep learning engineering skills, paired with experience in training and inference with PyTorch or TensorFlow, are essential.
  • Familiarity with distillation, transformers, and diffusion models.
  • Knowledge of deployment technologies and practices such as Docker, MLOps, and ML services is valuable, and experience with cloud platforms like Azure and AWS is a plus.
  • We value your excellent problem-solving abilities and your capacity to analyze complex issues and drive solutions with a data-driven approach.
  • Your strong verbal and written communication skills and success in cross-functional team environments will help us all succeed together.

Nice To Haves

  • Experience with generative image and video models is a plus.

Responsibilities

  • You'll build and optimize infrastructure that powers large foundation model training on thousands of GPUs.
  • You'll profile GPU utilization, trace inference and training runs, and help craft strategies for optimizing our ML model latency.
  • We'll work together to architect and optimize end-to-end ML pipelines, ensuring they're scalable, efficient, and robust.
  • You'll dive deep into data to recommend the right models, evaluation metrics, and governance approaches.
  • Throughout the product lifecycle, you'll engage in the architecture, design, deployment, and optimization of ML models and systems.