Sr. Staff Software Engineer - HPC Network Engineering

LinkedIn
Mountain View, CA
Posted 6d ago
$181,000 - $297,000
Hybrid

About The Position

At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team. This role will be based in Mountain View, CA.

We are seeking an HPC Network Engineer to design, deploy, and operate high-performance, low-latency Ethernet fabrics for large-scale GPU clusters. The role focuses on RoCE v2–based GPU interconnect networks supporting AI/ML training, inference, and HPC workloads. You will work closely with systems, GPU, platform, and software teams to build scalable, lossless Ethernet networks optimized for RDMA traffic.

As a Senior Staff Software Engineer, you will define long-term technical direction, lead cross-org initiatives, mentor senior engineers, and drive solutions for complex distributed systems challenges at massive scale. This role requires deep expertise in backend systems, data processing, and large-scale system design, with a strong understanding of networking concepts.

Requirements

  • BA/BS degree in Computer Science or a related technical discipline, or equivalent practical experience.
  • 10+ years of experience building and operating large-scale distributed systems or data-intensive backend platforms.
  • Experience in one or more programming languages such as Go, Python, C++, or similar.
  • Experience in Linux system engineering and host networking.
  • Demonstrated knowledge of network protocols, fabric design, and performance optimization.
  • Proven ability to lead complex technical initiatives end-to-end in a multi-team environment.
  • Strong system design skills with a focus on scalability, reliability, and performance.
  • Experience with container platforms (Kubernetes) and microservices.

Nice To Haves

  • Experience supporting large-scale AI or HPC workloads.
  • Familiarity with LLM training frameworks and communication libraries (e.g., NCCL, MPI).
  • Experience with streaming systems (Kafka, Flink, Spark Streaming, or similar) and high-throughput data pipeline architectures.
  • Experience with performance benchmarking and profiling tools.
  • Experience with infrastructure automation or configuration management tools.
  • Demonstrated influence across organizations (tech lead, architect, principal/IC leadership roles).

Responsibilities

  • Network architecture and design for large-scale LLM training and inference workloads.
  • Design RoCE v2–based GPU interconnect fabrics for multi-rack and multi-pod GPU clusters.
  • Define lossless Ethernet architectures (Clos / fat-tree / leaf-spine) optimized for RDMA.
  • Select and validate 400G / 800G Ethernet switching platforms and NICs (ConnectX, BlueField, etc.).
  • Apply deep expertise in host-level and Kubernetes pod networking architectures, including enablement of high-performance features such as RDMA and GPUDirect.
  • Tune host network performance for large-scale collective communications, balancing latency, throughput, and congestion control.
  • Analyze system performance and diagnose complex cross-layer issues.

Benefits

  • We strongly believe in the well-being of our employees and their families. That is why we offer generous health and wellness programs and time away for employees of all levels.
  • LinkedIn is committed to fair and equitable compensation practices.
  • The pay range for this role is $181,000 to $297,000.
  • Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location.
  • The pay range may differ in other locations due to differences in the cost of labor.
  • The total compensation package for this position may also include an annual performance bonus, stock, benefits, and/or other applicable incentive compensation plans.
  • For more information, visit https://careers.linkedin.com/benefits.