Member of Technical Staff, LLM Evaluation

MicrosoftMountain View, CA
4d

About The Position

Leverage expertise to measure the performance of Copilot, identify failure modes and novel mitigation strategies, including data mining, prompt engineering, LLM as a judge, and classifier training. Creative problem solving, navigating complexity with clarity, independently shaping direction and delivering results even when the path isn't obvious. Create and implement comprehensive evaluation frameworks across diverse scenarios, edge cases, and potential failure modes. Build automated testing systems, generalize solutions into repeatable frameworks, and write efficient code for model pipelines and intervention systems. Maintain a user-oriented perspective by understanding needs from user perspectives, validating approaches through user research, and serving as a trusted advisor on AI matters Track advances in research, identify relevant state-of-the-art techniques, and adapt algorithms to drive innovation in production systems serving millions of users.

Requirements

  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 5+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results) OR Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 7+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results) OR Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 10+ years data science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results)
  • Doctorate in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 8+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results) OR Master's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 10+ years data-science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results) OR Bachelor's Degree in Data Science, Mathematics, Statistics, Econometrics, Economics, Operations Research, Computer Science, or related field AND 12+ years data science experience (e.g., managing structured and unstructured data, applying statistical techniques and reporting results) OR equivalent experience
  • Experience prompting and working with large language models
  • Experience writing production-quality Python code
  • Demonstrated interest in Responsible AI

Responsibilities

  • Measure the performance of Copilot
  • Identify failure modes and novel mitigation strategies, including data mining, prompt engineering, LLM as a judge, and classifier training
  • Creative problem solving, navigating complexity with clarity, independently shaping direction and delivering results even when the path isn't obvious
  • Create and implement comprehensive evaluation frameworks across diverse scenarios, edge cases, and potential failure modes
  • Build automated testing systems, generalize solutions into repeatable frameworks, and write efficient code for model pipelines and intervention systems
  • Maintain a user-oriented perspective by understanding needs from user perspectives, validating approaches through user research, and serving as a trusted advisor on AI matters
  • Track advances in research, identify relevant state-of-the-art techniques, and adapt algorithms to drive innovation in production systems serving millions of users

Stand Out From the Crowd

Upload your resume and get instant feedback on how well it matches this job.

Upload and Match Resume

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Education Level

Ph.D. or professional degree

Number of Employees

5,001-10,000 employees

© 2024 Teal Labs, Inc
Privacy PolicyTerms of Service