Site Reliability Engineer

Gallup • San Francisco, CA

1d • $150,000 - $200,000 • Hybrid

About The Position

Build Gallup's observability foundation and shift how we detect, respond to and prevent system issues before they affect customers. As a founding member of Gallup’s new site reliability engineering team, you’ll define and scale our observability strategy across engineering and bring reliability engineering principles — automation, observability and continuous improvement — to everything we build. You’ll unify different teams’ monitoring solutions into a cohesive, proactive approach, consolidate our tooling, build automated workflows and establish processes that help us catch problems before they become incidents. In this role, you’ll shape Gallup’s global technology platform to ensure the systems delivering analytics and insights to millions remain fast, resilient and always available. If you’re eager to drive resilience in systems that empower people and organizations worldwide, this is your opportunity — apply today.

Requirements

Bachelor's degree in computer science, MIS or a related field, or equivalent experience, required
At least three years of experience in site reliability engineering, DevOps or infrastructure roles with a focus on monitoring and observability required
Experience with observability and monitoring tools such as Dynatrace (preferred), Datadog, Grafana or similar platforms required
Experience with incident management tools like PagerDuty or similar alerting systems required
Strong understanding of AWS cloud infrastructure and how to monitor distributed systems required
Experience integrating monitoring and alerting systems with collaboration platforms like Slack required
Ability to work with application teams across multiple languages and frameworks (e.g., Java, .NET, Python) required
Knowledge of metrics, logging and tracing as pillars of observability required
Experience writing scripts or automation (e.g., Python, Bash, PowerShell) to support monitoring workflows required
A commitment to working on-site at Gallup’s San Francisco office at least three days a week required

Nice To Haves

Observability expertise: You've built or scaled monitoring and observability practices, not just maintained existing systems.
Tool consolidation experience: You've successfully unified fragmented monitoring solutions across multiple teams.
AI mindset: You reduce repetitive operational work through thoughtful automation and workflow design.
Incident response leadership: You've designed or improved incident management processes and know how to balance speed with thoroughness.
Communication and enablement: You go beyond building dashboards; you guide others in how to instrument their code and interpret metrics.
Experience with containerized applications and infrastructure as code preferred

Responsibilities

Establish the foundation of Gallup’s SRE function by defining standards, best practices and scalable systems that will grow with the organization
Build and evolve observability infrastructure using tools like Dynatrace, Datadog, Grafana and PagerDuty to monitor applications running on AWS
Design and implement automated alerting workflows that integrate directly with Slack
Establish incident response processes that integrate monitoring, alerting and team communication to reduce recovery time and improve service continuity
Create dashboards and metrics that give engineering teams real-time insight into application performance and system reliability
Identify opportunities for automation and design self-healing systems in partnership with DevOps engineers
Enable end-to-end monitoring and faster issue detection by partnering with application teams to embed observability into Java, .NET and Python services
Lead initiatives that help engineering teams adopt and use observability tools effectively
Identify patterns in system behavior that indicate potential issues before they affect customers

Benefits

Gallup offers a robust benefits package that includes medical, dental, vision, life and other insurance options; a fully vested 401(k) retirement savings plan with company matching; an employee stock ownership program; mass transit reimbursement; family-building benefits; an employee assistance program; and various reimbursements and activities that enhance our associates’ wellbeing.
