Director of Site Reliability Engineering (SRE)

Backblaze External Website
Remote

About The Position

We are seeking an ambitious and accomplished Director of SRE to join our Cloud Operations leadership team. In this role, you will lead the front-line teams responsible for delivering mission-critical SRE production services. You will drive initiatives to identify, prioritize, and execute opportunities that enhance our core operational competencies and create broad organizational impact. As a champion of engineering excellence, you’ll focus on performance measurement, incident/change management, problem resolution, and process discipline. This is an exciting opportunity to significantly impact our company's growth trajectory and shape the future of our global production footprint. This position may be remote, but our team loves face-to-face collaboration. Most of our leaders are spread across the country, so we would like them to join us at our organized workshops, which take place across the US. This role involves managing a globally distributed workforce across multiple time zones.

Requirements

  • Proven experience in a similar leadership role within the MSP or Infrastructure-as-a-Service industry.
  • Excellent collaboration and communication skills, including building high-performing teams and interacting effectively with team members and colleagues across Backblaze.
  • Strong analytical and problem-solving abilities, with a data-driven approach to decision-making.
  • Significant experience in cloud-scale data center systems and services
  • Significant experience in managing mission-critical operations of complex global infrastructure.
  • A passion for process improvement and project management at scale
  • 6+ years of management experience, with at least 3 years at the Director level
  • 5+ years of hands-on technical experience in a field related to the team’s focus
  • Ability to travel domestically and internationally as needed
  • Remote work from within the continental USA is OK, with experience managing remotely

Nice To Haves

  • Six Sigma training and/or certification

Responsibilities

  • Lead a globally distributed team of 15+ highly technical teammates:
  • Provide 24/7 services for SRE
  • Own the single source of truth for the state of production
  • Centrally manage all aspects of incident and change management
  • Maintain a culture of continuous improvement, leveraging operational data to prioritize work across teams.
  • Be customer-focused, with a strong bias to action.
  • Collaborate closely with Customer Support to provide seamless world-class support
  • Collaborate with Supply Chain to manage proper levels of inventory
  • Lead & coordinate strategic initiatives to evolve and improve production support and incident, change, and asset management.
  • Liaise with Vendor Management and Legal to manage critical contract renewal cycles
  • Establish department-level objectives, policies, and procedures, creating OKRs or other measurements as applicable
  • Recruit & coach the team to support Backblaze and individual career objectives
  • Build strong cross-functional relationships, most notably with Infrastructure Engineering, Customer Support, and Data Center Operations.
  • Manage department budget
  • Be an engaged, visible, and admired leader

Benefits

  • RSU grants for full-time employees
  • Annual Company bonus plan
  • Healthcare for family, including dental and vision
  • 401(k)
  • Employee Stock Purchase Plan (ESPP)
  • Flexible vacation policy
  • Maternity & paternity leave
  • MacBook Pro for work plus a generous stipend to personalize your workstation
  • Childcare bonus (human children only)
  • Fertility treatment and support
  • Learning & development program
  • Commuter benefits
  • A culture that supports a healthy work-life balance