About The Position

The High Performance Computing and Artificial Intelligence (HPC and AI) team is building the next-generation distributed AI supercomputer. Our goal is to enable breakthroughs in AI by delivering unmatched computational power, scalability, and reliability. We design and develop advanced infrastructure that supports high-performance model training at scale, laying the groundwork for innovations that expand the boundaries of what AI can achieve.

We are seeking a Cloud Network Engineer II who is passionate about designing and developing the infrastructure that powers large-scale AI and HPC systems. In this role, you will contribute to the design, deployment, and operation of network infrastructure, automation workflows, observability frameworks, and performance optimization systems. These components are essential for achieving ultra-low latency, high throughput, and efficient data movement at petabyte scale across distributed workloads.

As a Cloud Network Engineer II on the HPC and AI Infrastructure team, you will work at the intersection of AI supercomputing and large-scale networking. Your contributions will directly impact the reliability and performance of distributed clusters that leverage high-speed fabrics such as Ethernet and InfiniBand, along with accelerated compute platforms including NVIDIA and AMD GPUs. This is a unique opportunity to help build the network infrastructure that delivers speed, reliability, and high availability at exascale, while collaborating across hardware, infrastructure, and platform teams.

Requirements

  • Experience in designing and developing network infrastructure.
  • Knowledge of automation workflows and observability frameworks.
  • Familiarity with performance optimization systems.
  • Experience with high-speed fabrics such as Ethernet and InfiniBand.
  • Knowledge of accelerated compute platforms including NVIDIA and AMD GPUs.

Responsibilities

  • Design, deploy, and operate network infrastructure for large-scale AI and HPC systems.
  • Develop automation workflows and observability frameworks.
  • Optimize system performance for ultra-low latency and high throughput.
  • Ensure efficient data movement at petabyte scale in distributed workloads.
  • Collaborate with hardware, infrastructure, and platform teams.