About The Position

NVIDIA is looking for an experienced Senior Software Engineer to expand the US-based Networking Hyperscale Engineering Team. Are you craving an opportunity to work directly with top-tier cloud and AI customers, co-develop software that powers their AI superclusters, and influence NVIDIA’s NIC software roadmap? In this role you will do just that for NVIDIA’s high-performance networking stack spanning Linux kernel, RDMA/RoCE, DPDK, DOCA, NCCL, and NIC firmware. You will be among the first to design and optimize the NIC and communication paths for our next-generation GPU and NIC platforms and help define their role in the modern AI data center. You’ll work closely with some of the best SDK, driver, firmware, and GPU/NIC architects in the industry, as well as domain experts in large-scale training, collectives, and systems performance.

Requirements

  • 12+ years of overall experience in a similar or related systems/networking software role.
  • A Bachelor’s, Master’s or PhD in Software Engineering, Computer Science, Computer Engineering, Electrical Engineering, or a related field (or equivalent experience).
  • Deep C/C++ expertise, strong Linux systems knowledge, and hands-on experience with kernel networking / RDMA / NIC drivers or DPDK.
  • Proven experience developing and debugging network operating systems (NOS) and routing/switching protocols used in AI data centers (for example BGP, ECMP, EVPN/VXLAN).
  • Practical experience with DOCA, NIC firmware interfaces, or other hardware-accelerated networking stacks for large-scale systems.
  • Excellent communication skills and a track record of effective collaboration with developers, partners, and customers in dynamic environments.

Nice To Haves

  • Deep knowledge of Linux kernel / systems internals, SoC / SmartNIC / NIC embedded systems, and data center switches and NOS.
  • Hands-on experience with RDMA/RoCE, GPU-related networking (for example GPUDirect RDMA), and high-performance, low-latency data paths.
  • Background optimizing NCCL or other distributed training stacks on large GPU clusters for throughput and tail latency.
  • Experience working with hyperscalers or major cloud providers on strategic, performance-critical AI networking deployments.
  • Contributions to open-source networking, RDMA, DPDK, kernel, CUDA/NCCL, or related ecosystems.

Responsibilities

  • Co-developing NIC software and communication paths with strategic, top-tier customers to enable and scale large AI superclusters.
  • Designing and implementing high-performance C/C++ components on Linux using DPDK, kernel-bypass techniques, and RDMA/RoCE.
  • Developing and integrating kernel, driver, and NIC firmware features to improve throughput, latency, and reliability for AI workloads.
  • Working closely with NCCL and distributed training teams to tune end-to-end collectives performance over NVIDIA networking at scale.
  • Owning complex performance and functionality debug with customers and representing the team in cross-org architecture discussions.

Benefits

  • Competitive salaries
  • Generous benefits package
  • Equity