Site Reliability Engineer

Kaseya Careers
Toronto, ON
CA$115,000 - CA$130,000
Onsite

About The Position

Kaseya is hiring a Site Reliability Engineer to keep our production systems healthy as we scale. You'll own the reliability of services that thousands of MSPs depend on every day. That means defining the SLOs we hold ourselves to, leading incidents when they happen, and building the automation that keeps things stable as we ship. The work is hands-on, the on-call rotation is real, and the environment runs heavily on AWS. If you treat reliability as a product instead of a chore, you'll fit in well here.

Requirements

  • 4 to 5 years of AWS production experience
  • IaC ownership with Terraform or CloudFormation, including state management
  • AWS ECS production experience (or a strong Kubernetes background and willingness to ramp up on ECS)
  • Active on-call rotation experience, including leading incidents and writing postmortems
  • Working fluency with SLOs, SLIs, and error budgets in production

Nice To Haves

  • Kubernetes production experience
  • Broader observability tooling (Datadog, Dynatrace, CloudWatch, Elasticsearch/Kibana)
  • Chaos engineering
  • AWS Lambda or serverless workloads
  • Ansible, Chef, or Puppet
  • DevSecOps work (vulnerability scanning, secrets management, SOC2 or ISO 27001)
  • Production database support (RDS, PostgreSQL, MySQL)
  • Open source contributions or public technical portfolio

Responsibilities

  • Set, monitor, and enforce SLOs, SLIs, and error budgets that keep our systems reliable
  • Lead incident response, troubleshooting, and blameless postmortems that produce real fixes
  • Build and maintain automated deployment, configuration management, and infrastructure provisioning using Infrastructure as Code
  • Manage cloud and hybrid infrastructure with Terraform or CloudFormation, balancing cost, scalability, and resilience
  • Improve observability across systems through proactive monitoring, alerting, and dashboards that surface issues early
  • Partner with development teams to bake reliability into the SDLC, including deployment automation, capacity planning, and chaos engineering
  • Cut operational toil through automation, systems that recover themselves, and engineering solutions that scale
  • Support containerized and serverless workloads so they stay highly available and fault tolerant in production
  • Stay current on SRE, cloud, and observability practices and bring what works back to the team

Benefits

  • The expected annual base salary for this role is CAD $115,000 to CAD $130,000.