Assoc Eng, SRE Data Video

Optimum
Town of Oyster Bay, NY

About The Position

We are seeking a Site Reliability Engineer (SRE I) to join our Video Platform Engineering team. In this role, you will support the operation, monitoring, and reliability of large-scale video delivery systems, including live streaming, video-on-demand (VOD), encoding, packaging, and content delivery networks (CDNs). As a Level 1 SRE, you will work closely with senior engineers to respond to incidents, perform routine operational tasks, and build foundational automation skills. This position is ideal for candidates with a background in systems or network administration who are eager to grow into video reliability engineering.

Requirements

  • Bachelor’s degree in Computer Science, Information Technology, Media Technology, or equivalent practical experience.
  • 0–2 years of experience in systems administration, IT operations, video engineering, or SRE-related work.
  • Basic understanding of Linux administration and troubleshooting.
  • Familiarity with scripting (Python, Bash, or PowerShell).
  • Exposure to monitoring/observability tools (Grafana, Prometheus, ELK, Splunk, DataDog, or similar).
  • General knowledge of networking fundamentals (TCP/IP, DNS, load balancing, HTTP).
  • Applicants must be authorized to work for ANY employer in the U.S.

Nice To Haves

  • Exposure to video delivery workflows (streaming, VOD, ABR packaging).
  • Awareness of streaming protocols (HLS, MPEG-DASH, RTMP).
  • Experience with cloud environments (AWS, Azure, GCP).

Responsibilities

  • Monitor the health and performance of video services (live, linear, and on-demand) using observability tools.
  • Support day-to-day operations of video platforms including encoding, transcoding, packaging, origin servers, DRM, and CDN delivery.
  • Assist in troubleshooting service-impacting issues and escalate to senior engineers when needed.
  • Participate in on-call rotations under supervision, responding to alerts and contributing to incident resolution.
  • Execute runbooks for routine operational tasks, deployments, and platform maintenance.
  • Write and maintain basic automation scripts (Python, Bash, PowerShell) to reduce manual work.
  • Document incident response steps, troubleshooting guides, and standard operating procedures.
  • Collaborate with cross-functional teams (network, storage, CDN, and applications) to support service delivery.

Benefits

  • Pay is competitive and based on a number of job-related factors, including skills and experience.