Software Development Engineer, ML Infrastructure Team

Amazon.com
Seattle, WA
83d
$129,300 - $223,600

About The Position

Want to help drive the success of Machine Learning technologies at AWS? Do you have the skills and motivation to build automation that supports the success of peer teams? We want to talk to you! We seek a Software Development Engineer for the Machine Learning (ML) Infrastructure team to build the tools that are used to guarantee top performance of AWS ML and High Performance Computing (HPC) technologies developed by our organization. Bring your exceptional knowledge of CI/CD automation, ML and HPC benchmarks and applications to bear on the cutting-edge software we develop. Join us as we expand the AWS offerings for AI, including Trainium, Neuron and the Elastic Fabric Adapter (EFA).

Requirements

  • 3+ years of non-internship professional software development experience
  • 2+ years of non-internship experience in the design or architecture (design patterns, reliability, and scaling) of new and existing systems
  • Experience with CI/CD pipelines and build processes
  • Experience using Linux, demonstrating proficiency with associated tools or languages
  • Experience coding in Python, TypeScript, and CDK

Nice To Haves

  • 3+ years of experience with the full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Bachelor's degree in computer science or equivalent

Responsibilities

  • Be an autonomous engineer on a team that builds and maintains the infrastructure that monitors and reports on functionality and performance of massive testing workloads run at scale.
  • Use internal Amazon CI/CD tools, Linux, and public AWS products to automate the delivery of our software to customers, saving developer time.
  • Write Python code that effortlessly spools up large clusters and runs benchmarks and applications for ML and HPC workloads.
  • Use AWS Managed Grafana and Athena to digest the massive amount of performance data generated by these workloads and create dashboards for developers and stakeholders.
  • Invent automatic mechanisms to alert developers to functional and performance regressions so they never reach customers.
  • Manage the complexity of infrastructure that covers many instance types, software stacks, Linux operating systems, and cutting-edge releases, and make it easy to evolve.

Benefits

  • Equity and sign-on payments as part of total compensation package
  • Full range of medical, financial, and/or other benefits

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Industry: General Merchandise Retailers
  • Education Level: Bachelor's degree
  • Number of Employees: 5,001-10,000 employees

© 2024 Teal Labs, Inc