Most AI systems work in demos. Very few hold up in real customer environments. This team is building the decision-making systems behind AI agents that operate across voice, chat, and email, where performance is measured in outcomes, not benchmarks.

You’ll work on models that need to reason over time, handle multi-step workflows, and stay consistent across entire interactions. Not just once, but repeatedly, under real-world constraints.

This is applied research that ships. You’ll take ideas from early concept through to production, owning how systems behave when deployed at scale. The challenge is not just capability. It’s reliability: making reasoning systems that can operate across long-context interactions, manage memory, use tools, and execute workflows without breaking down.

You’ll be working closely with product and engineering teams, iterating on real-world failures, and improving systems based on how they actually perform in production.
Job Type: Full-time
Career Level: Senior
Education Level: No Education Listed