AI Researcher - Reinforcement Learning

1X | San Carlos, CA
$200,000 - $300,000 | Onsite

About The Position

1X is building humanoid robots that perform household chores and tasks, giving people more free time. This means solving hard problems in robotics, AI, and manufacturing simultaneously, at scale, and within a safe, family-friendly form factor. The company has been developing these robots since 2014 and is now focused on shipping its flagship product, NEO, a home robot designed to move, learn, and operate alongside people in real-world environments. The company is seeking people inspired by this mission who want to contribute to a product that will change how humans spend their time by creating abundance safely for all.

The Reinforcement Learning team teaches NEO new capabilities by training policies for manipulation and locomotion tasks. The team works across simulation and real-world environments, at the intersection of algorithm development, sim-to-real transfer, and production deployment. Its success is measured by how reliably trained policies perform on physical robots in homes.

This role owns the entire pipeline, from RL algorithm development to production deployment: training NEO on manipulation and locomotion tasks in simulation, bridging the sim-to-real gap, and shipping reliable policies for real-world home environments. This is critical-path work, because the robot's capabilities depend directly on the quality of the RL policies. The role requires close collaboration with hardware, controls, data collection, and QA teams, with impact measured by the robot's performance in the field.

Requirements

  • Strong Python and/or C++ with experience in large codebases and build tools (Bazel or equivalent)
  • Proficiency with PyTorch for RL policy training and experimentation
  • Hands-on experience with simulation platforms (Isaac Sim, MuJoCo, or equivalent) for policy training at scale
  • Demonstrated experience training RL policies for manipulation or locomotion tasks, including addressing the sim-to-real gap on physical hardware
  • Sim-to-real practitioner: has closed the sim-to-real gap on physical systems; understands domain randomization, reward shaping, and the engineering required to make simulated policies transfer reliably to real hardware
  • RL algorithms depth: strong foundation in RL algorithms (PPO, SAC, TD-MPC, or similar); can choose the right approach for the task and modify or extend it when standard methods fall short
  • Full-stack ownership: owns data engineering, model architecture, and deployment; treats a promising training curve as the beginning of the job, not the end
  • Effective cross-functional partner: works closely with hardware, controls, QA, and data teams to translate RL research into deployed robot skills, and communicates technical constraints clearly across disciplines

Nice To Haves

  • Experience with model-based RL or world-model-guided policy learning that leverages predictive models to improve sample efficiency
  • Familiarity with imitation learning or learning from demonstration (behavior cloning, GAIL, IQL) as a complement or bootstrap to RL
  • Experience deploying RL-trained policies to physical robots in production environments, including monitoring, failure analysis, and iterative improvement
  • Background in legged locomotion, dexterous manipulation, or contact-rich control for physical systems

Responsibilities

  • Train and deploy RL policies for manipulation and locomotion tasks that perform reliably in real-world home environments, as measured by field task success rates, not just simulation benchmarks
  • Advance sim-to-real transfer techniques that measurably narrow the gap between simulation training performance and real-world policy behavior, enabling faster iteration cycles
  • Build training and evaluation infrastructure that lets the team iterate on policies faster, with standardized benchmarks, automated regression detection, and clear connections between training metrics and field performance
  • Partner with hardware, controls, data, and QA teams to ship RL-trained skills to production customer sites, owning the handoff from research to deployment

Benefits

  • Comprehensive medical, dental, and vision coverage
  • Generous paid time off, company holidays, and parental leave
  • 401(k) plan with company match (100% on the first 3% of contributions, 50% on the next 2%)
  • Flexible Spending Accounts (FSA) and Health Savings Accounts (HSA) options
  • Commuter benefits (transit and parking)
  • Short-term and long-term disability, and life insurance
  • Employee Assistance Program (EAP) for mental health, financial, and personal support
  • Onsite snacks and catered lunches