Senior Network Reliability Engineer

Roblox
San Mateo, CA
22d

About The Position

Join us as a Senior Network Reliability Engineer on Roblox's Infrastructure "Physical Networking - Network Reliability Systems" team. You will help improve the reliability and efficiency of Roblox's global physical network infrastructure.

Requirements

  • Over 4 years of experience developing automation for physical networks, which could include network automation, monitoring systems, network device configuration management, or zero-touch provisioning (ZTP) workflows.
  • Experience managing and troubleshooting large-scale network deployments.
  • Strong knowledge of network protocols (TCP/IP, UDP, DHCP, DNS), with hands-on experience in IPv4 and IPv6.
  • Experience working in multi-vendor environments with hands-on exposure to networking hardware and software.
  • Familiarity with at least one programming language (Python, Go).
  • Experience in Linux with an understanding of the networking stack.
  • BA/BS degree in a relevant engineering field, or equivalent experience.

Nice To Haves

  • Familiarity with Quality-of-Service (QoS).
  • Familiarity with BGP/MPLS/IS-IS.
  • Experience with open-source hardware and software solutions.

Responsibilities

  • Support Network Engineers by implementing automation and tooling that improve network operational efficiency and reduce manual intervention in tasks such as network device configuration, network troubleshooting, and incident response.
  • Build, maintain, and support automation solutions for streamlined network deployments across backbone, edge, and core network infrastructure.
  • Work closely with the Network Reliability Systems (NRS) SWE team to achieve comprehensive automation, establish critical dashboards, and define/refine alerts to optimize telemetry and alerting coverage.
  • Collaborate with various Roblox Infrastructure groups to maintain a highly available and highly performant network infrastructure.
  • On-Call Support: Participate in weekly on-call rotations to provide 24x7 support for the global, large-scale production network infrastructure.