Senior Director of Engineering, SRE

AlphaSense
$186,000 - $255,000

About The Position

We are looking for a Senior Director of Site Reliability Engineering (SRE) to define and lead reliability and operational excellence across AlphaSense’s products and platforms. This is a senior, technically demanding role with broad organizational impact. Operating within AlphaSense’s “you build it, you run it” culture, the role succeeds by shaping how reliability and operations are practiced across a large engineering organization, enabling hundreds of engineers to build, run, and support production systems with confidence. The role leads the evolution of Site Reliability Engineering at AlphaSense, with direct responsibility for the SRE team and significant influence on reliability practices across engineering. As the function matures, the role will scale its leadership capacity while maintaining a globally distributed, follow-the-sun SRE operating model across the US, EU, and India.

Requirements

  • Several years of senior leadership experience in a Site Reliability Engineering capacity
  • Deep knowledge of SRE principles and practices (SLIs/SLOs, error budgets, reliability economics)
  • Experience building self-service systems through platform engineering
  • Strong background in distributed systems and microservices
  • Production experience operating Kubernetes-based platforms
  • Solid understanding of cloud-native networking fundamentals
  • Experience running systems in multi-cloud environments (AWS and at least one of GCP or Azure)
  • Proven success scaling SRE practices across large engineering organizations
  • Demonstrated experience building, mentoring, and developing high-performing SRE teams
  • Ability to grow and sustain an inclusive, resilient engineering culture
  • Experience operating in a “you build it, you run it” culture
  • Ability to lead through influence and partnership

Responsibilities

  • Lead reliability and operational excellence across AlphaSense’s platforms and products
  • Scale SRE practices in a “you build it, you run it” engineering organization
  • Lead and grow a follow-the-sun SRE team across multiple time zones
  • Build, mentor, and develop high-performing SRE engineers
  • Own incident management, on-call operations, and post-incident learning
  • Cultivate awareness of reliability and a culture that values it throughout the engineering organization
  • Set direction for observability and operational tooling
  • Enable teams to operate production systems safely and confidently
  • Embed reliability into the whole software delivery lifecycle in collaboration with Product, Platform, Cloud, and Security
  • Reduce systemic risk through toil reduction and continuous improvement