Site Reliability Engineer III

AbsenceSoft
Remote

About The Position

We're looking for a senior Site Reliability Engineer to join our small, high-ownership SRE team. In this hands-on individual contributor role, you'll own the reliability, scalability, and security of AbsenceSoft's production infrastructure on AWS — supporting a B2B SaaS platform that processes sensitive employee leave data for enterprise customers. You'll work closely with infrastructure, application engineering, product leadership, and cross-functional partners in Security and Compliance, with a clear path to grow toward a Tech Lead opportunity as our team and platform continue to mature.

Requirements

  • 5+ years of experience in SRE, DevOps, or a related engineering role, with advanced hands-on expertise in AWS production environments and core services including Lambda, ECS, S3, ALB, and GuardDuty.
  • Strong proficiency in infrastructure-as-code tooling such as Terraform, CloudFormation, or CDK, paired with experience building and operating CI/CD pipelines using Jenkins and GitHub.
  • Proficiency in Python, Go, or Bash for automation, alongside hands-on experience with Datadog or a comparable observability platform for monitoring, alerting, and log management.
  • Demonstrated experience leading incident response in complex, distributed systems, with working knowledge of SLO/SLI frameworks, error budgets, and disaster recovery planning against defined RTO/RPO objectives.
  • Familiarity with SOC 2 compliance frameworks and experience contributing to audit readiness, access controls, and security control evidence collection.
  • A collaborative, ownership-driven mindset with strong communication skills, a passion for mentoring junior engineers, and a commitment to reducing toil through automation and AI-assisted tooling.

Responsibilities

  • Architect, implement, and operate scalable, resilient, and secure AWS infrastructure — including GuardDuty, Lambda, EventBridge, SNS, SES, S3, ALB, and ECS container workloads.
  • Lead infrastructure-as-code initiatives to ensure all environments are reproducible, auditable, and consistently configured in support of SOC 2 change management controls.
  • Design, maintain, and improve CI/CD pipelines using Jenkins and GitHub to enable reliable, repeatable software delivery — partnering with application engineering to reduce release risk and increase deployment frequency.
  • Own the Datadog observability platform, including dashboards, monitors, alerting thresholds, and log management; define and maintain SLOs, SLIs, and error budgets to guide reliability investment and reduce alert fatigue.
  • Serve as a senior technical responder across the full incident lifecycle — detection, containment, resolution, and postmortem — within a shared on-call rotation, and lead blameless postmortems to drive down incident frequency and MTTR.
  • Refine, implement, and test disaster recovery plans to meet RTO/RPO objectives, while contributing to SOC 2 audit readiness with a focus on access controls, incident response, and risk mitigation.
  • Mentor junior SREs through code reviews, incident pairing, and documentation of runbooks and engineering standards.

Benefits

  • Impact that matters. You’ll do work that shapes the future of the modern workplace.
  • Flexibility and trust. We’re remote-first and results-driven. You’ll have the freedom and flexibility to do your best work, wherever you do it best.
  • Growth and development. We believe the best work happens when people are growing. You’ll have access to learning resources, leadership programs, and real opportunities to take on new challenges and expand your impact.
  • Competitive rewards. We offer comprehensive benefits, a performance-based bonus program, and equity opportunities – because when we grow, you should too.
  • Time for life. Recharge and reconnect with flexible time off, paid holidays, and flexible leave programs designed to support every season of life.
  • Belonging and balance. We’re building an inclusive culture where every voice is valued, collaboration is celebrated, and success is shared.