Assoc Eng, SRE Data Video

Optimum • Town of Oyster Bay, NY

About The Position

We are seeking a Site Reliability Engineer (SRE I) to join our Video Platform Engineering team. In this role, you will support the operation, monitoring, and reliability of large-scale video delivery systems, including live streaming, video-on-demand (VOD), encoding, packaging, and content delivery networks (CDNs). As a Level 1 SRE, you will work closely with senior engineers to respond to incidents, perform routine operational tasks, and build foundational automation skills. This position is ideal for candidates with a background in systems or network administration who are eager to grow into video reliability engineering.

Requirements

Bachelor’s degree in Computer Science, Information Technology, Media Technology, or equivalent practical experience.
0–2 years of experience in systems administration, IT operations, video engineering, or SRE-related work.
Basic understanding of Linux administration and troubleshooting.
Familiarity with scripting (Python, Bash, or PowerShell).
Exposure to monitoring/observability tools (Grafana, Prometheus, ELK, Splunk, DataDog, or similar).
General knowledge of networking fundamentals (TCP/IP, DNS, load balancing, HTTP).
Applicants must be authorized to work for ANY employer in the U.S.

Nice To Haves

Exposure to video delivery workflows (streaming, VOD, ABR packaging).
Awareness of streaming protocols (HLS, MPEG-DASH, RTMP).
Experience with cloud environments (AWS, Azure, GCP).

Responsibilities

Monitor the health and performance of video services (live, linear, and on-demand) using observability tools.
Support day-to-day operations of video platforms including encoding, transcoding, packaging, origin servers, DRM, and CDN delivery.
Assist in troubleshooting service-impacting issues and escalate to senior engineers when needed.
Participate in on-call rotations under supervision, responding to alerts and contributing to incident resolution.
Execute runbooks for routine operational tasks, deployments, and platform maintenance.
Write and maintain basic automation scripts (Python, Bash, PowerShell) to reduce manual work.
Document incident response steps, troubleshooting guides, and standard operating procedures.
Collaborate with cross-functional teams (network, storage, CDN, and applications) to support service delivery.