Senior Engineering Manager, Engineering Operations

Ridgeline | Reno, NV
36d | $200,000 - $235,000 | Hybrid

About The Position

As the Senior Engineering Manager for Engineering Operations at Ridgeline, you will lead a high-impact team that ensures our platform is reliable, observable, and cost-effective at scale. This role is responsible for defining and executing our strategy around incident response, FinOps, and system-wide telemetry, enabling Ridgeline's engineering and business leaders to make critical decisions with confidence. Your work will improve visibility, reduce friction, and unlock proactive insights across the organization. You'll leverage cutting-edge technologies, including AI tools like GitHub Copilot and ChatGPT, to elevate operational excellence and drive efficiency throughout Ridgeline's technical ecosystem.

At Ridgeline, how we work matters as much as what we build. Ridgeliners act like owners, choose growth over comfort, and communicate with transparency. We assume positive intent, bias toward action, and bring solutions, not just problems. We celebrate wins, learn from setbacks, and thrive in a resilient, collaborative, high-performing culture.

Requirements

  • 10+ years of experience in SRE, infrastructure, or technical operations, including 3-6 years in a leadership role
  • Expertise in observability platforms like Datadog, Prometheus, ELK, or OpenTelemetry
  • Experience integrating technical telemetry with business metrics and cost models (e.g., cost-per-customer, MTTR, unit metrics)
  • Proven success scaling incident management frameworks and post-mortem processes
  • Proficiency with SQL, data modeling, or BI tools like Looker or Tableau
  • Strong collaboration skills and the ability to communicate technical insights to executive audiences
  • Calm, effective communicator who performs well under pressure and in incident response environments
  • Passion for continuous improvement, resilience, and mentorship

Nice To Haves

  • Prior experience in the FinTech or SaaS industry
  • Familiarity with AI/ML solutions in observability and operations
  • Experience managing infrastructure in a cloud-native environment (e.g., AWS, Kubernetes)

Responsibilities

  • Lead and evolve Ridgeline's observability and telemetry ecosystem to ensure critical metrics are trustworthy, actionable, and widely adopted
  • Define and execute the company-wide incident management strategy, enabling rapid response and continuous learning
  • Drive cost optimization and forecasting by scaling our FinOps practice with integrated usage and financial telemetry
  • Collaborate with Site Reliability Engineering (SRE) to create cross-system observability standards and ensure consistency in logs, metrics, tracing, and cost data
  • Build a unified metrics platform that combines operational, financial, and organizational performance data for real-time executive decision-making
  • Identify, automate, and eliminate high-frequency operational tasks using AI, reducing toil and increasing focus on continuous improvement
  • Define, track, and communicate KPIs for system reliability, operational efficiency, and infrastructure cost-effectiveness
  • Mentor and grow a diverse team of engineers, fostering a culture of ownership, learning, and transparency

Benefits

  • Unlimited vacation
  • Educational and wellness reimbursements
  • $0-cost employee insurance plans
  • Company stock plan

What This Job Offers

Job Type

Full-time

Career Level

Mid Level

Industry

Publishing Industries

Education Level

No Education Listed

Number of Employees

251-500 employees
