About The Position

Join us in building the systems that enable Amazon’s AI models to learn from real-world customer behavior and continuously evolve at massive scale. We are seeking a Software Development Manager to lead the design and development of large-scale training data systems and experimentation frameworks that power Amazon’s next-generation shopping AI. This role leads a team responsible for building distributed systems that transform real customer interactions into measurable learning signals that continuously improve the Amazon shopping experience.

As a Software Development Manager, you will define the technical vision and execution strategy for scalable systems that convert live traffic into structured datasets used across model training and evaluation workflows. You will lead engineers in building reliable, production-grade infrastructure while partnering closely with applied scientists to analyze model behavior, identify model quality gaps, and translate those insights into improved training data recipes and experimentation frameworks.

This role operates at the intersection of ML infrastructure, data science, and production systems, with end-to-end ownership spanning data ingestion, training signal design, experimentation, and evaluation. You will drive the team’s roadmap, mentor engineers, and collaborate across teams to ensure Amazon’s AI systems can continuously learn from real customer behavior. You will also lead the integration of agent-driven systems that automate data curation, evaluation, and continuous model improvement, helping redefine how large-scale AI training workflows evolve in production.

Requirements

  • 5+ years of Software Engineer, Software Developer, or related occupational experience
  • 3+ years of engineering team management experience
  • 3+ years of experience providing technical leadership and project management across all aspects of the software development lifecycle
  • Knowledge of engineering practices and patterns for the full software/hardware/networks development life cycle, including coding standards, code reviews, source control management, build processes, testing, certification, and livesite operations
  • Experience building and operating highly available, distributed systems for the extraction, ingestion, and processing of large data sets

Nice To Haves

  • Experience with Machine Learning and Large Language Model fundamentals, including architecture, training/inference lifecycles, and optimization of model execution, or experience in machine learning, data mining, information retrieval, statistics or natural language processing
  • Experience using business metrics and training data to track, trend, and manage the impact of training efforts

Responsibilities

  • Lead a team of engineers to design and build scalable training data systems that enable AI models to continuously learn.
  • Define the architecture and roadmap for high-throughput pipelines that transform customer behavior data for use across model training and post-training workflows.
  • Partner with scientists to analyze model behavior, identify model quality gaps, and translate insights into improvements in training data coverage and learning signals.
  • Drive the design and evolution of training data recipes, including sampling strategies, signal weighting, filtering, and dataset composition to improve model performance.
  • Establish experimentation and measurement frameworks to validate training signal quality and ensure that data changes lead to measurable model improvements.
  • Ensure strong data governance, security, and compliance across large-scale production data workflows and ML training pipelines.
  • Provide technical leadership and architectural guidance for distributed systems and ML training infrastructure across multiple teams.
  • Lead the development of agent-driven workflows that automate data curation, evaluation, and continuous model improvement loops.
  • Hire, develop, and mentor engineers, fostering a culture of ownership, operational excellence, and high engineering standards.
  • Collaborate across engineering teams, science teams, and partner organizations to accelerate AI model iteration and its impact on the Amazon shopping experience.

Benefits

  • health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
  • 401(k) matching
  • paid time off
  • parental leave