About The Position

Security is the most critical priority for our customers in a world awash in digital threats, regulatory scrutiny, and estate complexity. Microsoft Security aspires to make the world a safer place for all. We want to reshape security and empower every user, customer, and developer with a security cloud that protects them with end-to-end, simplified solutions. The Microsoft Security organization accelerates Microsoft’s mission and bold ambitions to ensure that our company and industry are securing digital technology platforms, devices, and clouds in our customers’ heterogeneous environments, as well as ensuring the security of our own internal estate.

Our culture is centered on embracing a growth mindset, inspiring excellence, and encouraging teams and leaders to bring their best each day. In doing so, we create life-changing innovations that impact billions of lives around the world.

The Microsoft AI Red Team is an interdisciplinary group of security experts, adversarial ML researchers, and software engineers with the mission of proactively identifying failures in Microsoft’s AI systems before they impact customers. Within the AI Red Team, our Tooling group builds platforms and developer experiences that enable teams across Microsoft to evaluate AI-powered systems at scale. Our team is the home of PyRIT (https://github.com/Azure/PyRIT), an open-source framework used for AI risk identification and evaluation. We are expanding this space by building a new platform that enables product teams to run system-level evaluations of agentic applications and model-enabled experiences inside normal engineering workflows.

We are looking for a Software Engineer II to help build this system-level evaluation platform. The systems it evaluates increasingly involve multi-step reasoning, tool and API use, multimodal inputs, and memory. In this role, you will contribute to the design and implementation of core platform capabilities and integration patterns that allow engineering teams to run evaluations in their workflows and pipelines.

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Requirements

  • Bachelor's Degree in Computer Science or related technical field AND 2+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
  • 1+ years of experience building systems with emphasis on reliability, durability, and operational efficiency, including experience with live site operations, incident response, and performance optimization.

Nice To Haves

  • Master's Degree in Computer Science or related technical field AND 3+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR Bachelor's Degree in Computer Science or related technical field AND 5+ years of technical engineering experience coding in the same kinds of languages, OR equivalent experience.
  • 1+ years of experience designing, building, and operating scalable, highly available cloud services or distributed systems on platforms such as Azure, AWS, GCP, or comparable cloud environments, with production ownership and CI/CD pipeline integration.

Responsibilities

  • Build core platform features that enable repeatable, scalable evaluations of agentic applications and model-enabled experiences (execution workflows, integration patterns, developer-facing interfaces).
  • Partner with product teams to gather requirements and deliver integrations that work across different system surfaces (chat experiences, APIs, and tool/orchestration layers) with minimal friction.
  • Ship high-quality, reliable code through strong engineering practices (design docs, code reviews, CI, unit/integration testing) to ensure correctness and reproducibility.
  • Make results actionable by producing clear, structured outputs and reporting artifacts (summaries, diagnostics, signals) that fit naturally into engineering workflows.
  • Serve as a Designated Responsible Individual (DRI) on-call for your area, monitoring for degradation or downtime, responding to incidents, and driving root-cause analysis and reliability improvements.