About The Position

Unity Vector builds an offline ML platform that powers insight, experimentation, attribution, and AI-driven decision-making across the company. Our systems operate at scale across batch and streaming data, supporting analytics, product intelligence, machine learning pipelines, and business operations. As data volume and complexity grow, our platform enables large-scale model training, feature generation, and experimentation workflows that power production ML systems.

We’re looking for a Machine Learning Engineer to join our Offline Infrastructure team. This is an ideal role for a recent PhD graduate who is excited to work on large-scale systems and apply research-driven thinking to real-world machine learning problems.

You’ll help build and evolve the infrastructure that powers training data generation, ML workflows, and distributed model training. Working closely with experienced engineers and researchers, you’ll contribute to systems that ensure our ML pipelines are reliable, scalable, and efficient. This role offers the opportunity to bridge research and production by translating advanced ideas into systems that operate at scale.

Requirements

  • PhD in Computer Science, Machine Learning, Systems, or a related field
  • Strong foundation in machine learning systems, distributed systems, or large-scale data processing (through research or projects)
  • Experience with Python and working with data-intensive workloads
  • Familiarity with ML frameworks (e.g., PyTorch, TensorFlow) and/or distributed systems (e.g., Ray, Spark)
  • Experience (academic or applied) with data pipelines, model training workflows, or large datasets
  • Strong problem-solving skills and ability to translate research ideas into practical systems
  • Interest in building scalable, reliable infrastructure for machine learning

Nice To Haves

  • Experience with workflow orchestration systems (Airflow, Flyte, etc.)
  • Exposure to large-scale data platforms (data lakes, warehouses, streaming systems)
  • Publications or research in ML systems, distributed systems, or related areas

Responsibilities

  • Build and maintain data pipelines that generate training datasets for machine learning models and experimentation
  • Contribute to infrastructure that supports distributed training workflows (e.g., PyTorch, Ray)
  • Work with workflow orchestration tools (e.g., Airflow, Flyte, or similar) to support multi-stage ML pipelines
  • Improve reproducibility and reliability through dataset validation, monitoring, and testing
  • Partner with ML engineers to support experimentation and model iteration
  • Help optimize performance and efficiency across data processing and training systems
  • Contribute to the evolution of our offline ML platform architecture as it scales

Benefits

  • Comprehensive health, life, and disability insurance
  • Commute subsidy
  • Employee stock ownership
  • Competitive retirement/pension plans
  • Generous vacation and personal days
  • Support for new parents through leave and family-care programs
  • Office food and snacks
  • Mental health and wellbeing programs and support
  • Employee Resource Groups
  • Global Employee Assistance Program
  • Training and development programs
  • Volunteering and donation matching program