Software Development Engineer, Machine Learning Networking Performance

Amazon
Santa Clara, CA
$165,200 - $223,600
Onsite

About The Position

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. We support all AWS data centers and all of the servers, storage, networking, power, and cooling equipment that ensure our customers have continual access to the innovation they rely on. You’ll join a diverse team of software, hardware, and network engineers, supply chain specialists, security experts, and operations managers. You’ll collaborate with people across AWS to help us deliver the highest standards for safety and security while providing seemingly infinite capacity at the lowest possible cost for our customers.

Amazon’s network is a key differentiator for Amazon Cloud Computing and Web Services (AWS), enabling the global operation of thousands of applications across hundreds of thousands of servers worldwide. The AWS Networking team develops and operates the network platform for all of Amazon, including e-commerce products and cloud computing solutions. This platform is industry-leading in efficiency, performance, reliability, and scale, and it is critical to the success of all AWS customers.

AWS Networking is looking for a Software Development Engineer to drive innovation in Machine Learning (ML) Network Performance. You will join a high-impact team of senior engineers who own measuring and improving the performance of our hyperscale data center networks. You will develop metrics and performance benchmarking systems to deeply understand the performance of the ML network, and you will develop and deliver innovative solutions that drive ever greater experiences for our ML customers. A successful candidate will have deep expertise in network routing, transport protocols, network hardware forwarding design, and ML workloads and benchmarking. This role is a tremendous opportunity to invent and deliver game-changing network solutions that directly benefit our ML customers.

You will be part of a networking team that owns the AWS ML network. We design, develop, manage, and operate the ML network in AWS. You have an opportunity to shape how one of the largest ML networks on the planet is going to be operated for at least the next decade. Our engineers, managers, and leaders are innovators and builders at heart; come join us and become integral to the technology company that is the past, present, and future of Cloud Computing.

The ML Performance team specifically pursues the goal of facilitating high-scale, repeatable tests on low-latency ML networks. We collaborate in finding the optimal network configurations for the highest performance for our customers.

Requirements

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience in design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • 1+ years of software development engineer or related occupational experience
  • 1+ years of experience designing and developing large-scale, multi-tiered, multi-threaded, embedded, or distributed software applications, tools, systems, and services using C#, C++, Java, or Perl
  • 1+ years of Object Oriented Design experience
  • Bachelor's degree or foreign equivalent in Computer Science, Engineering, Mathematics, or a related field
  • Experience programming with at least one software programming language

Nice To Haves

  • 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent

Responsibilities

  • Develop metrics and performance benchmarking systems to deeply understand the performance of the ML network.
  • Develop and deliver innovative solutions that drive ever greater experiences for our ML customers.
  • Invent and deliver game-changing network solutions that directly benefit our ML customers.
  • Design, develop, manage and operate the ML network in AWS.
  • Facilitate high-scale, repeatable tests on low-latency ML networks.
  • Collaborate in finding the optimal network configurations for the highest performance for our customers.

Benefits

  • health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
  • 401(k) matching
  • paid time off
  • parental leave