Site Reliability Engineer, Production Reliability (Remote - Canada)

Yelp, Inc
Vancouver, BC
$135,000 - $185,000
Remote

About The Position

Yelp engineering culture is driven by our values: we’re a cooperative team that values individual authenticity and encourages creative solutions to problems. All new engineers deploy working code in their first week, and we strive to broaden individual impact with support from managers, mentors, and teams. At the end of the day, we’re all about helping our users, growing as engineers, and having fun in a collaborative environment.

Do you want to build and manage scalable, self-healing, globally distributed systems? Our Site Reliability Engineers keep Yelp fast, available, and growing, connecting users to great local businesses. No matter how many times we get searched, scraped, scanned, spammed, pinged, paged, or queried, we gotta keep our cool - and keep the site running smoothly. We work in both the development and systems worlds, implementing key parts of the core architecture and supporting developers as they try to do the same. We get to tackle interesting challenges that you can only find at the kind of scale that serves over 100 million users per month. You’ll work to empower Yelp: spinning up infrastructure should always be a git commit and a code review away, with automation and self-service at the core of what we do.

This opportunity is fully remote and does not require you to be located in any particular area in Canada. We welcome applicants from throughout Canada.

We’d love to have you apply, even if you don’t feel you meet every single requirement in this posting. At Yelp, we’re looking for great people, not just those who simply check off all the boxes.

Requirements

  • Mastery of Linux (we use Ubuntu but any distro is fine), with a view to debugging ambiguous OS behaviours.
  • Command of your favorite modern programming language (Python, TypeScript, Ruby, Go, Rust, Java, C++, etc.) and an appreciation for delivering safe and secure services.
  • A solid understanding of the fundamental technologies for delivering services on the Internet (TCP/IP, HTTP, DNS, etc.).
  • Experience with public cloud platforms (we use AWS and GCP, but others are also fine) and related tooling (Terraform, Puppet, Chef, Ansible, etc.).
  • Experience with Linux containerisation and orchestration (e.g., Docker, Podman and Kubernetes).
  • Self-motivation to investigate, fix and improve Yelp in an ever-changing environment.
  • Experience leading, collaborating on and sharing technical activities with global teams.
  • Ownership of the total lifecycle of a system.

Responsibilities

  • Bring your curiosity, tenacity and experience.
  • Work with engineers across Yelp to support new features and services.
  • Integrate tools to monitor platform stability and performance.
  • Help scale our Kubernetes clusters and AWS-based infrastructure while maintaining our platform's SLOs.
  • Ensure the reliability of Yelp’s primary datastores (MySQL and Cassandra).
  • Troubleshoot site issues using industry-leading tools like Splunk, Grafana, and Prometheus.
  • Automate everything with Python, Puppet, Git, Jenkins, Terraform and more!
  • Develop custom tools when off-the-shelf solutions don’t work at our scale, and contribute upstream to open source projects.
  • Design and implement new systems, tests, and procedures.
  • Participate in light on-call rotations - we have geographically distributed SRE teams for follow-the-sun support, which reduces the need to be on-call 24h a day!

What This Job Offers

Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed
Number of Employees: 1,001-5,000 employees
