Reinforcement Learning Engineer ($400k - $800k salary)

Baton Corporation Ltd
London, New York
Onsite

About The Position

Baton Corporation is seeking a Reinforcement Learning Engineer to own a production trading system that directly deploys real capital. The role focuses on building robust, measurable, and safe learning systems under real-world constraints rather than on pure research. The engineer will be responsible for increasing trading volume and user participation in a memecoin ecosystem through an RL-driven trading agent. That means designing reward functions and policies, building evaluation frameworks, and transitioning existing systems to learning-based approaches. The engineer will be the only RL expert and will have end-to-end ownership.

Requirements

  • Previously put an autonomous learning system into production that directly controlled capital, pricing, traffic, or resources, and can explain what broke and how you fixed it
  • Personally designed and enforced hard risk limits (capital caps, loss bounds, circuit breakers) in a live system
  • Built a policy evaluation loop from scratch (simulators, replay, counterfactuals, shadow deployments) before trusting live rollout
  • Can make and defend uncomfortable tradeoffs (e.g. heuristic > RL, bandit > deep RL) based on empirical results instead of ideology
  • Operated as the single owner of a complex ML system in a small team, with no safety net of research orgs, infra teams, or “ML platforms”

Responsibilities

  • Own and ship an RL-driven trading agent using real capital to increase trading volume and user participation in a memecoin ecosystem
  • Design reward functions and policies aligned with product goals while enforcing strict downside risk constraints
  • Build evaluation and validation frameworks (simulation, offline analysis) to minimize reliance on live sequential testing
  • Safely transition an existing heuristic-based production system toward learning-based approaches
  • Take end-to-end ownership and technical leadership as the sole RL expert, from data and modeling through deployment, monitoring, and safeguards

Benefits

  • Unmatched ownership and autonomy
  • Exposure to systems operating at the edge of crypto scale
  • The ability to ship fast and see real-world impact immediately