Principal Software Engineer, CoreAI

Microsoft — Redmond, WA

About The Position

Join Microsoft’s AI Core team building high-performance runtime systems that serve OpenAI chat and multimodal AI models at scale. This role focuses on systems-level optimization for large-scale LLM inference and requires deep C++ expertise.

Requirements

  • 6+ years of experience in systems programming with strong expertise in C++.
  • Proven experience building, deploying, and operating scalable cloud services.
  • Strong debugging skills and experience using performance profiling and diagnostic tools.
  • Hands-on experience with distributed systems, Kubernetes, and containerized workloads.
  • Experience with large-scale LLM inference infrastructure, including CUDA.
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Nice To Haves

  • Experience optimizing AI model inference across a distributed GPU/CPU stack.
  • Exposure to Azure OpenAI or similar large-scale AI serving platforms.
  • Understanding of site reliability engineering (SRE) principles and operational excellence.

Responsibilities

  • Design and implement high-performance microservices and runtime components in C++.
  • Optimize AI inference systems for latency, throughput, cost, and reliability at large scale.
  • Debug and resolve complex production issues related to performance, scaling, and service reliability.
  • Collaborate with cross-functional partners to integrate model inference pipelines into scalable infrastructure.
  • Contribute to state-of-the-art multimodal inferencing systems supporting text, speech, and vision workloads.
  • Drive systems-level innovations for real-time and batch inference efficiency.
  • Participate in code reviews and provide technical mentorship to senior and peer engineers.