About The Position

AWS Infrastructure Services owns the design, planning, delivery, and operation of all AWS global infrastructure. This team is responsible for keeping the cloud running by supporting all AWS data centers and the associated equipment. They work on challenging problems with thousands of variables impacting the supply chain and are looking for talented individuals to join their diverse team of engineers, specialists, and managers. The role involves collaborating across AWS to maintain high standards of safety and security while providing scalable capacity at a low cost. The team fosters an inclusive culture that encourages bold ideas and empowers employees to see them through to completion. Amazon Web Services (AWS) aims to be the world's infrastructure platform, demanding high quality and reliability from its services. The company is expanding rapidly, and this role is crucial in maintaining quality and reliability while innovating operational processes.

Requirements

  • Bachelor's degree, or a Master's degree and experience working in a large-scale networking environment
  • 1+ years of experience with major internet routing protocols
  • 1+ years of experience working in a Linux/Unix environment

Nice To Haves

  • Knowledge of networking protocols, including HTTP(S), DNS, and TCP/IP
  • Knowledge of network analysis fundamentals and robust troubleshooting
  • Experience dealing effectively with customers during problem resolution and operating efficiently under pressure, or experience troubleshooting and documenting findings
  • Experience prioritizing and handling multiple assignments at any given time while maintaining commitment to deadlines, or experience completing complex tasks quickly with little to no guidance and reacting with appropriate urgency to situations that require a quick turnaround
  • Strong Unix/Linux skills and the ability to script in Python, shell, C, or C++ are highly desirable

Responsibilities

  • Provide critical on-shift network operations support to Amazon.com customers to diagnose and respond to large-scale events.
  • Support and maintain next generation networks.
  • Deliver simple, sustainable and repeatable solutions and processes.
  • Partner with the broader Technical Operations organization to reduce operational burden.
  • Work closely with Network Engineering & Deployment teams to ensure operational readiness for new deployments.
  • Drive standards across the network and ensure compliance with those standards and policies.
  • Participate in and drive impact mitigation during large-scale events using an established Event Management process.
  • Drive event deep dives for large-scale events, deliver high-quality documentation for the events and drive corrective actions to completion.
  • Improve detection mechanisms by designing and implementing new alerts.
  • Identify and troubleshoot recurring platform issues and effectively engage with mid- and senior-level engineering teams for full resolution.
  • Create and review documentation and processes for recurring issues, new standard operating procedures, and knowledge transfer material.
  • Troubleshoot interconnectivity issues, including network device configuration and low-level application interaction.
  • Identify and drive opportunities to make tasks repeatable through creation and maintenance of scripts and tools.
  • Effectively contribute towards hiring and developing others in the team.

Benefits

  • health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
  • 401(k) matching
  • paid time off
  • parental leave
  • sign-on payments
  • restricted stock units (RSUs)