Senior ML Engineer, Fauna

Amazon | New York, NY

About The Position

We are seeking a Senior ML Engineer to build and scale the machine learning systems that power our intelligent robots. In this role, you will design and maintain the infrastructure for training, evaluating, and deploying the ML models that enable robot locomotion, perception, manipulation, navigation, and human-robot interaction. You'll work at the intersection of machine learning and systems engineering, ensuring our ML training and deployment systems are robust, efficient, and scalable as we grow from prototype to production.

Requirements

  • 5+ years of non-internship professional software development experience
  • 5+ years of programming experience with at least one programming language
  • 5+ years of experience leading design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience as a mentor or tech lead, or leading an engineering team
  • Bachelor's degree or higher (such as a Master's degree) in computer science, machine learning, engineering, or a related field
  • Experience with machine learning and large language model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution, or development experience within the last 3 years
  • Experience with machine learning (ML) tools and methods
  • Experience with Kubernetes, Docker, or the broader container ecosystem; or strong analytical skills, attention to detail, and effective communication abilities, together with programming/scripting experience (Batch, VB, PowerShell, Java, C#, Chef, Perl, Ruby, and/or PHP)

Nice To Haves

  • Experience building and operating a cloud-based architecture
  • Experience with robotics data (sensor streams, video, point clouds) and real-time inference systems
  • Familiarity with model optimization techniques (quantization, pruning, distillation)
  • Experience with reinforcement learning or simulation-based training pipelines

Responsibilities

  • Design and build scalable ML training infrastructure, including distributed training pipelines and GPU cluster management both in the cloud and on-prem
  • Develop systems for experiment tracking, model versioning, and reproducibility
  • Build deployment infrastructure for serving ML models on robotic hardware with strict latency requirements
  • Optimize model inference for edge devices and embedded systems
  • Collaborate with research teams to accelerate the path from experimentation to production
  • Contribute to data pipelines and labeling infrastructure as needed, in partnership with the data platform team

Benefits

  • Health insurance (medical, dental, vision, prescription, basic life and AD&D insurance with optional supplemental life plans, EAP, mental health support, medical advice line, flexible spending accounts, adoption and surrogacy reimbursement coverage)
  • 401(k) matching
  • Paid time off
  • Parental leave