About The Position

Research Program Managers at Reflection are high-leverage leaders and operators who embed directly with research and infrastructure teams to accelerate the pace of frontier model development. They are not project trackers. They are force multipliers who bring clarity to ambiguity, drive decisions when the path forward is unclear, and ensure that the work happening across multiple teams connects into a coherent whole.

This role focuses on scaling our research infrastructure to support massive, frontier-scale training runs across pre-training, mid-training, and post-training. You will work closely with teams building on training libraries like Megatron, driving the programs that turn raw clusters into reliable, high-performance training environments. Your job is to make sure the infrastructure we build works end-to-end, that teams are unblocked, and that we can scale with confidence as our ambitions grow.

You bring a first-responder mentality. When things go sideways, you don't wait to be asked. You jump in, assess the situation, cut through noise, align the people who need to be aligned, and drive resolution.

Requirements

  • 7+ years of experience in technical program management, research operations, or infrastructure coordination, ideally in ML/AI or large-scale distributed systems environments.
  • Deep technical knowledge to engage with engineers on topics like distributed training frameworks, GPU cluster architecture, scheduler behavior, networking, and storage systems. You don't need to write the code, but you need to understand the systems well enough to "speak the language": to ask the right questions and identify risks early.
  • Proven ability to operate effectively in high-ambiguity, fast-moving environments. You create structure where there is none and drive clarity without waiting for permission.
  • Track record of managing complex, multi-team programs with competing priorities and hard deadlines. You know how to make tradeoffs and you communicate them clearly.
  • Strong stakeholder management skills across both deeply technical ICs and senior leadership. You build trust by being reliable, direct, and well-informed.
  • Comfortable operating in crisis mode. You stay calm under pressure, you know how to prioritize when everything is on fire, and you follow through on the other side.
  • Excited to build from zero to one. We are a small, fast-moving team, and this role will help define how Research Program Management works at Reflection.
  • Motivated by enabling researchers and engineers to build the world's most capable open-weight AI systems.

Responsibilities

  • Own cross-functional programs spanning training infrastructure and cluster reliability across pre-training, mid-training, and post-training workstreams.
  • Drive end-to-end coordination of scaling our training stack, working alongside engineering leads and external partners.
  • Jump into active incidents and escalations to triage, coordinate response, and drive resolution across teams. Champion a culture of blameless post-mortems and continuous learning, turning every incident into a concrete improvement to our systems and processes.
  • Partner with infrastructure and research engineering leads to identify bottlenecks, define priorities, and ensure that infrastructure investments are directly tied to research velocity.
  • Build and maintain visibility into training run health, cluster reliability, and infrastructure performance so that leadership and teams have the context they need to make fast, informed decisions.
  • Create lightweight, durable processes for cross-team handoffs, config management, checkpoint workflows, and other coordination-heavy touchpoints that currently rely on ad hoc communication.
  • Translate technical complexity into clear status updates and decision frameworks for engineering leadership and executives.

Benefits

  • Comprehensive medical, dental, vision, life, and disability insurance.
  • Fully paid parental leave for all new parents, including adoptive and surrogate journeys.
  • Financial support for family planning.
  • Paid time off when you need it.
  • Relocation support.
  • Lunch and dinner are provided daily.
  • Regular off-sites and team celebrations.

What This Job Offers

Job Type

Full-time

Career Level

Senior

Education Level

No Education Listed

Number of Employees

1-10 employees
