Principal Applied Scientist

Microsoft, Redmond, WA

About The Position

As a Principal Applied Scientist on the Copilot Agentic Security team, you will serve as a scientific and technical leader advancing the foundations that secure Microsoft’s agentic and autonomous AI systems. You will define what to measure, how to measure it, and how to validate mitigations at scale, shaping how Copilot evaluates and reduces real-world security risk. You will operate across the full security lifecycle (attack simulation → threat pattern discovery → mitigation design → empirical validation → continuous measurement), owning end-to-end evaluation strategies rather than isolated experiments. Your work informs durable, reusable defenses that integrate into Copilot’s orchestration layers, shared services, and emerging agents. At IC6, success means not only producing strong scientific artifacts but also setting scientific direction, influencing engineering and security strategy, and ensuring that mitigation decisions are grounded in objective evidence, robust metrics, and reproducible methodology.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Requirements

  • Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 6+ years related experience (e.g., statistics, predictive analytics, research)
  • OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 4+ years related experience (e.g., statistics, predictive analytics, research)
  • OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience (e.g., statistics, predictive analytics, research)
  • OR equivalent experience.
  • Experience designing and executing scientific experiments for complex, real‑world systems, including defining metrics, evaluation methodology, and validation criteria.
  • Experience building or applying ML models or statistical techniques to analyze system behavior, detect failure modes, or evaluate mitigation effectiveness.
  • Proficiency in one or more programming languages such as Python, C++, C#, or Java, including building reproducible analysis or evaluation pipelines.
  • Experience designing or evaluating algorithms, metrics, datasets, or data pipelines used to assess system quality, safety, or security.
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Nice To Haves

  • Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 9+ years related experience (e.g., statistics, predictive analytics, research)
  • OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND 6+ years related experience (e.g., statistics, predictive analytics, research)
  • OR equivalent experience.
  • 5+ years experience creating publications (e.g., patents, libraries, peer-reviewed academic papers).
  • 2+ years experience presenting at conferences or other events in the outside research/industry community as an invited speaker.
  • 5+ years experience conducting research as part of a research program (in academic or industry settings).
  • 3+ years experience developing and deploying live production systems, as part of a product team.
  • 3+ years experience developing and deploying products or systems at multiple points in the product cycle from ideation to shipping.
  • Ability to operate effectively in highly ambiguous, fast‑evolving security environments, particularly those involving autonomous or generative AI systems.
  • Willingness to engage deeply across organizational boundaries to drive durable security outcomes.
  • Work‑site requirements vary by location and follow organizational guidance.
  • This role may involve working with sensitive or confidential AI model data in accordance with Microsoft Responsible AI and Security policies.

Responsibilities

  • Lead the design of scientific approaches for evaluating agentic security risks, including defining metrics, experimental methodology, and validation criteria.
  • Serve as a scientific authority for how Copilot measures security quality, mitigation effectiveness, and residual risk over time.
  • Identify gaps in existing evaluation approaches and invent new methodologies where current techniques are insufficient.
  • Develop and apply advanced techniques in machine learning, generative modeling, and empirical evaluation to identify and characterize risks in agentic and autonomous systems.
  • Build reproducible experiments, datasets, and scorecards that quantify system vulnerabilities and mitigation performance across realistic threat scenarios.
  • Investigate model behaviors, failure modes, and threat classes emerging from MSRC cases, red‑team exercises, and internal adversarial testing.
  • Design and operate end-to-end evaluation pipelines that measure agentic security quality across Copilot orchestration components and shared services.
  • Translate top security risks into measurable scientific requirements, evaluation rubrics, and success criteria aligned with Copilot’s protection lifecycle.
  • Use telemetry and real‑world signals to continuously tune scoring functions, alerting thresholds, and validation loops.
  • Partner with engineering and adversarial testing to close the loop from threat discovery → mitigation design → empirical validation.
  • Prototype and validate mitigation approaches, balancing security effectiveness, model behavior, system performance, COGS, and latency.
  • Analyze incidents and vulnerability reports to generalize categorical mitigations that eliminate classes of threats rather than single exploits.
  • Work closely with engineering to ensure scientific outputs are operationalized into production‑grade services and reusable platform components.
  • Collaborate with PM/TPM and partner security teams (e.g., MSRC, Microsoft AI Security, Azure Security) to align scientific insights with roadmap decisions.
  • Communicate findings, tradeoffs, and impact clearly to senior engineering leadership and stakeholders.