Leverage expertise to measure the performance of Copilot and to identify failure modes and novel mitigation strategies, including data mining, prompt engineering, LLM-as-a-judge, and classifier training.
Solve problems creatively, navigate complexity with clarity, and independently shape direction and deliver results even when the path isn't obvious.
Create and implement comprehensive evaluation frameworks across diverse scenarios, edge cases, and potential failure modes.
Build automated testing systems, generalize solutions into repeatable frameworks, and write efficient code for model pipelines and intervention systems.
Maintain a user-oriented perspective by understanding user needs, validating approaches through user research, and serving as a trusted advisor on AI matters.
Track advances in research, identify relevant state-of-the-art techniques, and adapt algorithms to drive innovation in production systems serving millions of users.
Job Type
Full-time
Career Level
Mid Level
Education Level
Ph.D. or professional degree
Number of Employees
5,001-10,000 employees