Principal Research Engineer, Model Training & Post-Training

Inflection AI, Palo Alto, CA
$400,000 - $550,000

About The Position

Inflection’s models are central to our product and platform strategy, and we are looking for a hands-on technical leader to own the model-improvement loop from data and training through evals, post-training, release criteria, and production feedback. This person will sit at the intersection of research, production engineering, and model release, with a mandate to ship models that are measurably better for users. The ideal candidate has led serious model training or post-training work before, can make principled tradeoffs across data, compute, architecture, and quality, and can organize that work around a clear technical roadmap.

Requirements

  • Experience leading, or serving as a principal contributor to, large-scale LLM, multimodal, or foundation-model training or post-training programs.
  • Deep experience with transformer-based models, hybrid architectures, modern deep-learning frameworks, and distributed training systems.
  • Strong practical experience with post-training and alignment methods such as SFT, RLHF, DPO, GRPO, RLAIF, reward modeling, preference optimization, tool-use fine-tuning, or related approaches.
  • Experience operating or partnering on large-scale training infrastructure, ideally including GPU clusters at the scale of 1,000+ GPUs.
  • Strong systems instincts around throughput, cost, reliability, observability, debugging, checkpointing, reproducibility, and fault tolerance.
  • Excellent judgment around data quality, evaluation design, model regressions, release readiness, and production model behavior.
  • Ability to balance research ambition with product pragmatism, user impact, and operational discipline.
  • Experience leading senior technical teams while continuing to contribute directly to technical decisions and implementation.
  • PhD in Computer Science, Machine Learning, Artificial Intelligence, or a related field, or equivalent practical experience.

Responsibilities

  • Own the model-improvement roadmap across capability, reliability, emotional intelligence, tool use, safety, latency, cost, and enterprise readiness.
  • Lead training and post-training strategy, including supervised fine-tuning, RLHF, DPO, GRPO, RLAIF, reward modeling, preference optimization, tool-use fine-tuning, distillation, synthetic data, and related methods.
  • Drive model architecture and optimization decisions across modern transformer-based and hybrid architectures, including both training-time and inference-time performance.
  • Lead large-scale training efforts on distributed GPU clusters, including systems operating at the scale of 1,000+ GPUs.
  • Define and execute data strategy across data curation, mixture design, deduplication, decontamination, human-in-the-loop pipelines, preference data, evaluation data, synthetic data, and production feedback loops.
  • Build and improve evaluation and release-quality systems, including model evals, quality gates, regression detection, release criteria, model-readiness reviews, and post-release monitoring.
  • Partner closely with infrastructure and research engineering teams to improve distributed training reliability, checkpointing, fault tolerance, observability, reproducibility, and cost-performance tradeoffs.
  • Debug and improve model behavior across the full stack: data, training, post-training, evaluation, infrastructure, product integration, and production feedback.

Benefits

  • Diverse medical, dental and vision options
  • 401(k) matching program
  • Unlimited paid time off
  • Parental leave and flexibility for all parents and caregivers
  • Support for country-specific visa needs for international employees living in the Bay Area