About The Position

We're seeking a Site Reliability Engineer (SRE) to help our clients build reliable, observable, and secure production systems. In this role, you will work closely with client engineering and operations teams to improve system reliability, reduce toil, and build the operational foundations (deployment pipelines, monitoring, incident management, and infrastructure) that keep production systems running smoothly. Although we specialize in healthcare and regulated industries, not all of our projects are in these fields, so you may occasionally work in other domains.

Requirements

  • Have 5+ years of experience in infrastructure, DevOps, or site reliability engineering
  • Have hands-on experience with AWS or Azure infrastructure and infrastructure-as-code tools (Terraform, CloudFormation, or equivalents)
  • Have strong experience with CI/CD pipelines (GitHub Actions, ArgoCD, Jenkins, or equivalents) and deployment automation
  • Have experience with observability tools (Prometheus, Grafana, Datadog, CloudWatch, or equivalents) and incident management processes
  • Are familiar with security best practices for cloud infrastructure, including network security, IAM, encryption, and vulnerability management
  • Have excellent communication skills and can explain infrastructure and reliability concepts to varied stakeholders
  • Are adaptable, self-directed, and comfortable in dynamic client environments
  • Can explain reliability and security trade-offs and connect them to business needs

Nice To Haves

  • Experience in client-facing roles such as consulting, implementation engineering, or advisory work.
  • Experience working in healthcare or other heavily regulated industries.
  • Software development experience beyond scripting, such as building features, APIs, or applications.
  • Experience with container orchestration (Kubernetes, ECS) and cloud-native tooling.
  • Experience building infrastructure automation using scripting (Python, Bash) or workflow tools.
  • Relevant certifications (AWS DevOps Professional, AWS Solutions Architect, CKA, or similar).

Responsibilities

  • Design and maintain resilient, secure cloud infrastructure using infrastructure-as-code; implement security controls, hardening standards, and compliance guardrails across client environments.
  • Design and implement monitoring, alerting, and logging systems; lead incident response and post-mortem processes; define and track SLOs and SLIs.
  • Automate deployment pipelines, infrastructure provisioning, and operational runbooks to reduce toil and improve system resilience.
  • Own the reliability and infrastructure workstream, guide client engineering teams on SRE practices, and contribute to architectural decisions.
  • Share SRE expertise with colleagues, contribute to internal tooling and documentation, mentor team members, and participate in the broader Toboggan community.

Benefits

  • Home office/technology budget
  • Yearly professional development budget
  • Company matching RRSP after 1 year
  • 100% employer-paid health and dental insurance, including a yearly bank of coverage for complementary medicine (acupuncture, osteopathy, massage therapy, naturopathy, psychology, etc.)
  • Life insurance, plus long- and short-term disability insurance
  • Parental leave top-up (8 weeks), available to employees with 1+ year of tenure, regardless of path to parenthood