About The Position

Cloud computing is a highly competitive, rapidly growing market and one of Microsoft’s most important initiatives. Customers bet big on the Azure cloud platform to run their businesses. The Azure Core Insights team is a growing Agile team seeking a passionate candidate with machine learning and data science expertise, development skills, and a strong interest in building Artificial Intelligence for IT Operations (AIOps) solutions that address the unique challenges of cloud environments and drive the next generation of cloud infrastructure.

As a member of our team, you will help design and implement anomaly detection, automatic triaging and correlation, and causal inference models to deliver preventive insights that improve the availability, reliability, and efficiency of the Azure cloud system. These efforts will be based on statistics, artificial intelligence and machine learning (AI/ML), large language models (LLMs), and artificial intelligence agents (AI Agents). You will help drive actions based on these insights and integrate AI/ML, LLM, and AI Agent technologies into our daily operations by collaborating with the research, engineering, and program management teams.

The ideal candidate is passionate about computer science, AI/ML, LLM, and AI Agent technologies, and about transforming data into meaningful insights that lead to actionable outcomes. If you’ve dreamed of having a global impact and love working with data and AI to create substantial business value, we’d love to talk to you!

Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Requirements

  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
  • 4+ years of experience in anomaly detection algorithm design or implementation.
  • 2+ years of familiarity with open-source machine learning libraries such as Scikit-Learn, Pandas, Seaborn, and/or similar.
  • 2+ years of experience with AI Agent frameworks and with machine learning or LLM models such as linear/nonlinear regression, Bagged and Boosted Trees, Bayes methods, Transformer, and/or similar.
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Nice To Haves

  • Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience.
  • 4+ years of basic computer science knowledge (concepts such as Central Processing Unit (CPU), memory, Top of Rack (ToR) switch, load balancer (LB), virtual network (VNet), and virtual local area network (VLAN)), as well as serverless architectures and other cloud architectural patterns.
  • 4+ years of expertise in prompt engineering within AI Agent frameworks, AI Agent deployment, large language models, and GPT.
  • 4+ years of proficiency in data visualization tools such as Power BI, NetworkX, or similar.

Responsibilities

  • Share accountability for a wide array of assets and be comfortable learning a broad array of technologies.
  • Independently design and implement anomaly detection, auto-triaging/correlation, and causal inference models to deliver preventive insights that improve Azure cloud system availability, reliability, and efficiency.
  • Work with partner teams to integrate these insights into Azure daily dev operations and the Azure system for automatic mitigation and repairs.
  • Contribute towards driving visibility into customer impact on Virtual Machines, Containers, or higher-level Azure services built on top of Virtual Machines.
  • Assist with building an automated data quality solution to detect problems in downstream dependencies and take automated action to correct them.
  • Look for opportunities to share learnings and tools broadly within Microsoft and beyond. Specifically, our team does cutting-edge work with Azure Data Explorer (Kusto) and makes a point of contributing back to the larger community.