Site Reliability Engineer

Happyrobot Inc.
San Francisco, CA
1d

About The Position

We're looking for a Site Reliability Engineer to take the lead on scaling our operational resilience as we grow. You'll own the stability, observability, and debugging workflows that keep our systems running smoothly. You'll be the go-to person for untangling complex failures in real time, designing tools that turn chaos into clarity, and helping us shift from reactive to proactive operations. This is a high-impact, high-trust role where you'll shape how reliability is done: reducing incident load, building internal tooling, and directly improving developer focus and system uptime. If you love getting to the root of hard problems and making systems (and teams) stronger, this is your moment.

Requirements

  • 3+ years of hands-on experience debugging production systems (logs, traces, incidents, etc.)
  • Strong problem-solving skills and ability to dive into unfamiliar backend codebases
  • Comfort with Python and Go for reading code and writing small tools/utilities
  • Familiarity with observability and monitoring tools (e.g., Datadog, Prometheus, Sentry)
  • Clear, calm communication under pressure, especially during live incidents

Nice To Haves

  • Experience working with distributed systems or services at scale
  • Experience building or maintaining internal tooling for on-call teams or reliability workflows
  • Familiarity with deployment pipelines, CI/CD, or infra-as-code
  • Experience improving system observability (e.g., custom metrics, traces, log pipelines)

Responsibilities

  • Scale our operational resilience as the company grows
  • Own the stability, observability, and debugging workflows that keep our systems running
  • Untangle complex failures in real time
  • Design tools that turn chaos into clarity
  • Help the team shift from reactive to proactive operations
  • Reduce incident load
  • Build internal tooling
  • Improve developer focus and system uptime

Benefits

  • Opportunity to work at a high-growth AI startup, backed by top investors.
  • Fast Growth - Backed by a16z and YC, on track for double-digit ARR.
  • Top-Tier Compensation - Competitive salary + equity in a high-growth startup.
  • Ownership & Autonomy - Take full ownership of projects and ship fast.
  • Work With the Best - Join a world-class team of engineers and builders.