Senior Software Engineer - CoreAI Model Inference & Serving

Microsoft • Redmond, WA

About The Position

Join our team within CoreAI, where we are building the AI data-plane that powers all LLM inferencing workloads across Microsoft and Azure customers, from cutting-edge startups to Fortune 500 enterprises. Our converged AI fabric delivers inference capabilities for all LLMs in the Microsoft catalog, including OpenAI, Anthropic, Mistral, Cohere, Llama, and more. As a Senior Software Engineer, you will shape the future of one of the largest and fastest-growing services in Azure, foundational to Microsoft’s AI strategy. Our mission is to serve models at scale (reliably, efficiently, and with ultra-low latency), enabling a rich set of AI-powered product experiences. This is a rapidly evolving space with immense opportunities to learn, innovate, and drive industry-wide impact!

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years of technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Java OR equivalent experience.
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role.
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Nice To Haves

4+ years of design and problem-solving experience, with understanding of system performance, scalability, and engineering best practices.
Understanding of distributed systems, specifically request serving at scale (e.g., inferencing, L7 gateways, high-performance storage, and distributed databases across global-scale infrastructure).
Demonstrated experience in building high-quality, reliable systems at scale.
Experience using modern AI-assisted development tools and workflows to move faster, improve quality, and amplify engineering impact.
Customer-obsessed approach to problem solving, with empathy and a drive to deliver impactful solutions.

Responsibilities

Be a hands-on technical leader, designing, coding, and shipping core serving systems, smart routing, and request distribution for a broad portfolio of LLMs, including OpenAI, Mistral, Grok, DeepSeek, and others.
Build large-scale AI services and platform capabilities that power new products and customer experiences.
Drive cutting-edge innovation in AI systems alongside world-class engineers and cross-functional partners.
Lead through architecture, code reviews, mentorship, and technical excellence while staying close to implementation.
Improve reliability, scalability, observability, efficiency, and performance across mission-critical services.
