Software Engineer, Engineering Ops

Ridgeline | New York, NY
32d | $146,000 - $172,000 | Hybrid

About The Position

As a Senior Engineer on the Engineering Operations team, you'll focus on building and automating the operational foundation that keeps Ridgeline running efficiently, reliably, and transparently. This team owns incident response coordination, operational readiness, observability and telemetry, financial efficiency tooling (FinOps), and compliance frameworks. You'll work across engineering, infrastructure, and product teams to ensure the systems we build are measurable, efficient, and easy to operate at scale.

This is a highly cross-functional role where you'll collaborate with SRE and product engineers to improve visibility, reduce toil, and build confidence in the systems that power Ridgeline. You'll leverage cutting-edge technologies, including AI tools like GitHub Copilot and ChatGPT, to enhance automation and accelerate innovation.

At Ridgeline, how we work matters as much as what we build. Ridgeliners act like owners, choose growth over comfort, and communicate with transparency. We assume positive intent, bias toward action, and bring solutions, not just problems. We celebrate wins, learn from setbacks, and thrive in a resilient, collaborative, high-performing culture.

Requirements

  • 5+ years of experience in SRE, DevOps, or Production Engineering roles
  • Strong background in operational automation using languages such as Python, Go, or Bash
  • Deep understanding of observability stacks such as Datadog, Prometheus, ELK, or OpenTelemetry
  • Practical experience building FinOps dashboards, cost tagging strategies, and anomaly detection workflows
  • Hands-on experience with incident response, root cause analysis, and post-incident process improvement
  • Solid cloud infrastructure experience (AWS preferred) and infrastructure-as-code tools such as Terraform or CDK
  • Clear and concise communicator who can partner across engineering, product, and business teams
  • Willingness to learn and adopt emerging technologies, including AI and automation tools
  • Strong ownership mindset with a drive for continuous improvement

Nice To Haves

  • Experience defining and running DR exercises and managing DR documentation
  • Background in service catalog design or system metadata modeling
  • Familiarity with compliance frameworks and audit-readiness for SLA or cost reporting

Responsibilities

  • Design and implement automation for critical operational workflows, including tenant provisioning, patch coordination, and configuration health
  • Drive incident response and post-incident improvement through runbooks, documentation, and root cause automation
  • Lead the design and implementation of unified observability frameworks that span system health, financial metrics, and organizational effectiveness
  • Build and maintain dashboards and telemetry pipelines that support FinOps cost optimization and policy compliance
  • Partner with engineering and infrastructure teams to define and track SLAs, RTOs, and RPOs across services
  • Maintain Ridgeline's disaster recovery documentation and lead regular DR exercises with measurable outcomes
  • Define and manage system and service manifests that document ownership, dependencies, and operational metadata
  • Contribute to shared operational libraries, templates, and standards that make integration easier across Ridgeline's platform
  • Collaborate with a diverse group of Ridgeliners to promote best practices in transparency, operational excellence, and continuous learning

Benefits

  • Unlimited vacation
  • Educational and wellness reimbursements
  • $0-cost employee insurance plans

What This Job Offers

  • Job Type: Full-time
  • Career Level: Mid Level
  • Industry: Publishing Industries
  • Education Level: No Education Listed
  • Number of Employees: 251-500 employees
