About The Position

Amazon Leo is Amazon’s low Earth orbit satellite network. Our mission is to deliver fast, reliable internet connectivity to customers beyond the reach of existing networks. From individual households to schools, hospitals, businesses, and government agencies, Amazon Leo will serve people and organizations operating in locations without reliable connectivity.

Are you passionate about building and operating networks at unprecedented scale? Join our Network Reliability team to shape the future of a global Service Provider network supporting 3,236 satellites and terrestrial infrastructure. You'll be at the forefront of ensuring reliable connectivity that directly impacts customer experience across our constellation. You'll work at the intersection of space-based and terrestrial networking, tackling unique challenges that few engineers ever encounter.

Your day might include analyzing network telemetry from thousands of satellites, collaborating with hardware teams on next-generation designs, or developing automation that prevents issues before they impact customers. You'll have the autonomy to identify problems, propose solutions, and drive them to completion while working with talented engineers across the organization.

Our Network Reliability team operates one of the most complex networks in existence, connecting ground stations, data centers, and a constellation of satellites to deliver seamless connectivity. We're a collaborative group of curious problem-solvers who thrive on technical challenges and are committed to operational excellence. We value innovation, encourage experimentation, and believe the best solutions come from diverse perspectives and rigorous technical debate.

Requirements

  • 5+ years of experience with major internet routing protocols
  • 5+ years of experience working in a Linux/Unix environment
  • 5+ years of experience with automation scripting in Python, Bash, Shell, and/or Perl
  • Experience with network troubleshooting tools (telnet, test-netconnection, tracert, tracetcp, iperf, ntttcp, dig, and packet capture tools), or experience managing and troubleshooting networks, and experience with automation and version control tools
  • Experience with incident management and troubleshooting in production environments

Nice To Haves

  • 2+ years of experience working in large-scale networking environments
  • Experience contributing to the definition and implementation of automation opportunities within an operations environment
  • Knowledge of satellite communication systems and RF networking

Responsibilities

  • Build and maintain processes, procedures, and tooling to monitor, troubleshoot, and operate a global Service Provider network at scale
  • Develop comprehensive understanding of end-to-end network architecture, including control and data planes across satellite constellation and terrestrial systems
  • Establish and enforce incident management and problem management processes to ensure rapid resolution
  • Engage with service owners to resolve incidents and coordinate cross-functional efforts
  • Prioritize and track initiatives across multiple teams to systematically eliminate recurring problems
  • Develop proactive monitoring and alerting solutions to continuously improve network reliability and customer experience

Benefits

  • health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
  • 401(k) matching
  • paid time off
  • parental leave