About The Position

Amazon’s Frontier AI & Robotics (FAR) team is seeking a Member of Technical Staff, Infrastructure to build and scale the foundational systems that power our robotics research and development platform. In this role, you will design and operate the distributed infrastructure that enables our researchers and engineers to train foundation models, run large-scale experiments, and deploy intelligent robotic systems at Amazon scale.

Join the next revolution in robotics, where you’ll work alongside world-renowned AI pioneers to push the boundaries of what’s possible in robotic intelligence. As a Member of Technical Staff focused on Infrastructure, you’ll build the critical platform layer that accelerates every aspect of FAR’s research, from high-throughput data pipelines and experiment management systems to low-latency model serving and configuration delivery for robotic deployments.

This role is deeply technical and focuses on performance, scalability, and reliability. You will design systems that support large volumes of training data, operate under strict latency requirements, and provide the compute and data foundation that enables breakthrough research across FAR’s robotics ecosystem.

Requirements

  • 5+ years of distributed systems experience
  • Bachelor's degree in Computer Science or a related field
  • Proficiency in Python and at least one systems or backend programming language (e.g., Go, Java, C++)
  • Experience with cloud infrastructure platforms (AWS, GCP, or Azure), including compute, storage, and networking services
  • Experience building or maintaining data pipelines, ETL systems, or ML training/serving infrastructure
  • Understanding of system reliability principles including monitoring, observability, fault tolerance, and on-call operational practices

Nice To Haves

  • Experience supporting AI/ML research workflows, including building and optimizing training stacks, experiment tracking, dataset management, or model deployment infrastructure
  • Familiarity with robotics platforms, simulation environments, or real-time systems with strict latency requirements
  • Experience with large-scale data processing frameworks (e.g., Apache Spark, Flink, or Ray) and query optimization for analytics workloads
  • Demonstrated ability to lead large technical initiatives and influence architectural decisions across cross-functional teams
  • Experience building developer tooling, internal platforms, or self-service infrastructure systems that improve research or engineering productivity

Responsibilities

  • Design and build scalable compute and data infrastructure to support model training, inference, and evaluation for frontier AI and robotics development
  • Lead large technical initiatives and shape the architecture of FAR’s research platform infrastructure
  • Develop tooling and frameworks that accelerate research workflows, including dataset management, visualization, and quality assessment systems
  • Optimize query performance and data availability for experimentation and analytics workflows used by research teams
  • Improve the performance, efficiency, and reliability of FAR’s core compute and storage infrastructure, ensuring systems remain fast and stable at scale
  • Build highly scalable experimentation and analytics infrastructure to support model evaluation, A/B testing, and feature performance
  • Collaborate directly with science and robotics teams to support research projects through both infrastructure development and hands-on technical contribution

Benefits

  • Equity
  • Sign-on payments
  • A full range of medical, financial, and/or other benefits