Reinforcement Learning Engineer, Grasping

Persona AI Inc • Houston, TX

About The Position

Persona AI is developing and commercializing rugged, multi-purpose humanoid robots that perform real work. Our mission is focused squarely on shipping beautiful, reliable products at massive scale, while building a customer-focused team to achieve these aims.

We are looking for a Reinforcement Learning Engineer to join our Manipulation team, focused on dexterous grasping. Our goal is to ship capable, reliable grasping policies on real hardware with high-DOF robotic hands. We are looking for someone who can follow recent advances in reinforcement learning and related learning-based methods, judge what is practically useful, and adapt those ideas to our platform. If you are earlier in your career but exceptional, we want to hear from you; equally, a more experienced candidate who brings deep RL expertise will thrive here.

Requirements

BS, MS, or PhD in Robotics, Computer Science, Machine Learning, or a related field.
2+ years of hands-on experience in reinforcement learning for robotic manipulation; exceptional recent graduates from relevant research labs will be considered.
Demonstrated ability to read, understand, and implement ideas from recent robotics and machine learning research.
Hands-on experience training RL agents for robotic manipulation tasks, including reward shaping and policy evaluation.
Experience with sim-to-real transfer: domain randomization, physics tuning, or real-world policy validation on hardware.
Proficiency in Python and deep learning frameworks (PyTorch, JAX), along with RL libraries such as rsl_rl or skrl.
Experience preparing meshes and collision geometries for RL environments in simulators such as MuJoCo and/or Isaac Sim.

Nice To Haves

Experience deploying RL-trained policies on physical robotic hands.
Experience with tactile sensors and integrating tactile feedback into learned grasp policies.
Experience with contact-rich manipulation and force/torque estimation.
Familiarity with other learning-based approaches such as behavior cloning, imitation learning, or diffusion-based policy methods.
Publications or project work at top-tier venues (CoRL, RSS, ICRA) on grasping or dexterous manipulation.
Experience in a humanoid robot startup environment.

Responsibilities

Train and iterate on reinforcement learning policies for complex grasping tasks including functional grasping, tool use, in-hand manipulation, and environment interaction.
Implement and refine sim-to-real transfer pipelines to bridge the gap between simulation and physical robotic hand performance.
Develop reward functions, curriculum strategies, and training environments in MuJoCo and Isaac Lab.
Run experiments on real robots alongside simulation, evaluating and debugging policy behavior on hardware.
Monitor, evaluate, and adapt state-of-the-art research in learning-based grasping for deployment on our humanoid platform.
Collaborate with the rest of the software team to deploy end-to-end grasping systems.
Benchmark and evaluate grasp policies across diverse objects, cluttered scenes, and real-world uncertainties.
Integrate tactile sensing and feedback into grasp policies for robust, force-aware manipulation.