Software Engineering Manager

Microsoft
Redmond, WA

About The Position

OneDrive and SharePoint (ODSP) power the world's most impactful intranets, collaboration experiences, business workflows, and content ecosystems. As AI becomes deeply embedded across these surfaces, from search, Q&A, and summarization to powerful synchronous and autonomous agents, our ability to measure quality, reliability, and safety at scale becomes a strategic advantage. Evaluation, both offline and online, is now the way we build and ship AI.

As the Engineering Manager for the Eval Tooling team, you will lead the group responsible for transforming how we test, measure, and improve AI quality across ODSP Experiences. Your mission starts with elevating developer productivity and enabling fast, confident iteration across a broad and rapidly expanding set of AI workloads: RAG, agents, content generation, semantic search, content understanding, and ODSP's emerging agents that orchestrate multi-step actions across files, lists, and sites.

You will also partner with Applied Science and Customer Success teams to scale customer data sets. You will partner closely with evaluation platform and tooling efforts across M365 to both leverage shared capabilities and contribute back to the broader ecosystem; we are One Microsoft. This is a hands-on technical and strategic role where you will define how ODSP Experiences builds trust in AI and ships AI safely, quickly, and confidently.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Requirements

  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • 2+ years of engineering management experience, with a track record of leading and developing high-performing teams.
  • 2+ years of experience in engineering tooling or evaluation development.
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Nice To Haves

  • Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • 4+ years people management experience.
  • Proven track record of hands-on technical leadership and people management in high-scale, cloud-based environments.
  • Deep understanding of AI/ML concepts and practical experience applying AI to real-world product scenarios.
  • Strong architectural skills in building scalable, distributed systems and cloud services.
  • Track record of rapid iteration, experimentation, and continuous learning.
  • Excellent communication, collaboration, and stakeholder management skills.

Responsibilities

  • Lead, mentor, and develop a high‑performing engineering team building evaluation tooling for AI scenarios across ODSP Experiences.
  • Define and drive the technical strategy for offline and online evaluation, including scenario‑based frameworks, dataset pipelines, LLM auto‑raters, metrics, and dashboards.
  • Partner closely with ODSP Core Eval Platform and M365‑wide tooling teams to leverage shared infrastructure, influence platform roadmaps, and align on quality bars.
  • Enable model agility and safe shipping through automated quality gates, regression detection, telemetry instrumentation, and reliable online metrics.
  • Collaborate deeply with AI feature teams across ODSP Experiences to embed evaluation into development workflows.
  • Foster a culture of measurement rigor, engineering excellence, and inclusive growth, balancing fast iteration with trust, safety, and customer value.