Software Engineer II

The Walt Disney Company • Glendale, CA

56d • $117,500 - $165,000 • Hybrid

About The Position

Disney Entertainment and ESPN Product & Technology is a global organization focused on building and advancing the technological backbone for Disney’s media business. The team combines technology with creativity to create world-class products, enhance storytelling, and drive innovation and scalability. They work across The Walt Disney Company’s media portfolio, impacting millions of users globally. Technologists in this area design and build products and platforms for media, advertising, and distribution. The products and brands, including Disney+, Hulu, and ESPN, are significant touchpoints for fans. The company fosters innovation by developing groundbreaking products and techniques to solve complex technical problems.

Product Engineering is a unified team responsible for the engineering of Disney Entertainment & ESPN digital and streaming products and platforms, covering areas like product engineering, media engineering, quality assurance, personalization, commerce, lifecycle, and identity.

The Observability & Insights group specifically ensures Disney Streaming’s distributed systems are reliable and performant, proactively preventing negative customer impact. This team translates operational and performance data into actionable insights to inform decisions and promote a culture of accountability and learning. By automating workflows and reducing toil, they aim to accelerate developer productivity and demonstrate that stability drives innovation. They build telemetry, dashboards, alerting, insights pipelines, and developer experience tooling to help engineers understand system health and act quickly.

Requirements

3+ years of experience in software engineering or equivalent training.
Experience developing and testing features within existing codebases.
Proficiency in one or more backend languages such as Go, Java, Python, or Node.js.
Experience with distributed systems, cloud infrastructure (AWS), and Kubernetes.
Understanding of observability fundamentals: metrics, logs, distributed tracing, alerting, SLIs/SLOs.
Ability to break down features into tasks and communicate clearly about technical issues and decisions.
Ability to participate in architecture discussions and contribute to code reviews.

Nice To Haves

Experience with modern observability platforms (Grafana, Prometheus, OpenTelemetry, Datadog, New Relic, OpenSearch).
Familiarity with building telemetry or real-time data pipelines.
Experience participating in incident response or reliability engineering processes.

Responsibilities

Build and Evolve Observability Tooling and Platforms:
Develop and enhance tools for metrics, logs, traces, event correlation, insights dashboards, and monitoring services used across all Disney Streaming service teams.
Write clean, maintainable code that integrates existing observability frameworks while continually improving performance, reliability, and scalability.
Implement features within telemetry ingestion pipelines and storage layers that support high-volume, low-latency data processing.
Automate infrastructure provisioning and management using Infrastructure as Code (IaC) tools.

Improve System Visibility and Operational Insights:
Collaborate with service engineering teams to design and instrument consistent, high-quality telemetry that reflects system behavior, business events, and performance indicators.
Build dashboards, automated reports, and insightful visualizations that help engineers detect anomalies early and understand root causes faster.
Work with incident response partners to refine and automate alerting mechanisms to reduce noise while improving detection fidelity.

Participate in Architecture, Design, and Code Quality Processes:
Contribute to technical design documents for new observability features and improvements, ensuring alignment with architectural principles and long-term platform goals.
Participate in peer code reviews, offering constructive feedback and learning from more senior engineers’ approaches.
Help evaluate technical tradeoffs and propose improvements to existing observability patterns used across the organization.

Support Reliability, Incident Response, and Continuous Improvement:
Assist in improving service reliability by leveraging observability tools, helping to uncover gaps, improve instrumentation, and drive more effective system behaviors.
Contribute to post-incident reviews by proposing durable changes to telemetry, alerting, or insights workflows according to our team’s best practices and standards.
Identify opportunities to automate manual developer or SRE workflows, reducing toil and improving peer velocity.

Collaborate and Take Ownership Within the Team:
Break down incoming features into actionable engineering tasks with clear deliverables and timelines.
Document changes, patterns, and best practices to improve clarity, maintainability, and onboarding for other engineers.
Begin taking ownership of specific subsystems or components within the observability platform, with support from senior engineers.
Support and mentor interns or early-career team members in areas where you have growing expertise.

Benefits

A bonus and/or long-term incentive units may be provided as part of the compensation package.
A full range of medical, financial, and/or other benefits, dependent on the level and position offered.
