Sr. Director, Platform Engineering & SRE

Ministry Brands, Alpharetta, GA

About The Position

As Sr. Director, Platform Engineering & SRE, you will build and lead the function responsible for the reliability, performance, and operational excellence of the Ministry Brands platform. You will own site reliability engineering, observability, production operations, and cloud engineering across our multi-cloud SaaS portfolio — establishing the practices, tooling, and standards that keep our products available and performant for the organizations we serve. This is a hands-on leadership role at the center of our most important technical priority: platform stability. You will define and drive measurable improvements in availability and incident response, stand up a modern SRE discipline, and partner closely with R&D, Product, and Security leaders to embed reliability into how we build and operate software. You will be accountable to executive leadership for platform availability and performance.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience
  • 10+ years of overall experience in software, infrastructure, or platform engineering
  • 6+ years in engineering leadership or management roles
  • Demonstrated track record building or scaling a Site Reliability Engineering or Platform Engineering function and improving availability/reliability outcomes
  • Deep, hands-on cloud experience at SaaS scale (Azure and/or AWS), including infrastructure-as-code and CI/CD
  • Strong background across: Site Reliability Engineering, Observability & Monitoring, Cloud & Infrastructure Engineering, Incident & Performance Management, Capacity Planning, and Production Operations

Nice To Haves

  • Experience operating multi-cloud and multi-tenant SaaS environments
  • Hands-on implementation of SLO/error-budget frameworks and modern observability tooling (e.g., Datadog, Grafana, Prometheus, OpenTelemetry)
  • Experience standing up reliability in a distributed or embedded (product-team) model
  • Exposure to SOC 2 and PCI DSS 4.0.1 control evidence at the infrastructure layer
  • Background in a private-equity-backed or high-growth SaaS environment
  • Demonstrated business acumen and sound decision-making in complex, multi-product environments

Responsibilities

  • Establish and own service-level objectives (SLOs), service-level indicators (SLIs), and error-budget policy across the product platform
  • Lead incident command, on-call rotation, escalation, and a blameless postmortem culture; drive measurable reduction in MTTR and change-failure rate
  • Set reliability standards and partner with embedded reliability engineers in R&D product teams to apply them at the point of system design
  • Drive availability toward enterprise targets and own the reliability roadmap and its reporting to executive stakeholders
  • Build and operate the observability platform — metrics, logs, traces, and alerting — and define the golden signals and dashboards used across products
  • Lead capacity planning, performance engineering, and operational-readiness reviews for new and existing services
  • Own production operations practices, runbooks, and escalation workflows that improve transparency, stability, and stakeholder communication
  • Deliver metrics-based reporting on platform availability and performance
  • Lead cloud engineering across our multi-cloud footprint (Azure, AWS, GCP), balancing reliability, performance, security posture, and cost
  • Own infrastructure-as-code, CI/CD platform standards, and the internal developer platform that product teams build on
  • Drive consolidation and standardization of fragmented infrastructure and pipeline tooling
  • Partner with Security to implement and evidence platform-layer controls in support of SOC 2 and PCI DSS objectives
  • Define team culture and objectives aligned to Enterprise IT & Security strategic goals; build, coach, and develop the Platform Engineering & SRE team
  • Build and maintain strong partnerships with R&D, Product, Security, and IT leaders
  • Develop and manage the platform engineering budget, balancing run-the-business needs with strategic investment, and author clear business cases for technology investments
  • Manage key cloud and tooling vendor relationships in partnership with IT and Procurement
  • Present updates, metrics, and recommendations to both technical and business stakeholders

Benefits

  • Robust healthcare options – Options include a plan that is 100% covered by Ministry Brands for employee-only coverage, as well as a generous company HSA contribution.
  • Flexible paid time off
  • Flexible work schedules
  • PTO for vacation
  • Up to 80 hours of paid sick/safe leave
  • 11.5 days of fully paid holidays
  • Paid parental leave
  • Mental health support through an Employee Assistance Program
  • Professional development reimbursement
  • Employee Recognition & Rewards through Nectar