In this role, you will contribute to building and operating AI-powered middle-tier services that support conversational experiences within widely used productivity applications. You will focus on prompt evaluation, testing, and automation, ensuring that AI responses are accurate, reliable, and aligned with business and user expectations. You will work closely with engineering, product, and data partners to evaluate LLM behavior, design test strategies, implement supporting code, and continuously improve prompt quality and system performance.
Job Type
Full-time
Career Level
Mid Level
Number of Employees
501-1,000 employees