Site Reliability Engineer Jobs

About The Position

The Payward Services (PWS) business unit powers Kraken's B2B and institutional product suite, serving external partners and institutional clients under contractual SLAs. As a Senior SRE, you will partner with PWS development and operations teams to manage infrastructure, improve CI/CD pipelines, and support operational excellence. You will bring expertise in infrastructure, monitoring, and automation to ensure performant, resilient, and continuously improving services.

Requirements

  • 5+ years in a DevOps or SRE role
  • Proficiency with hybrid-cloud infrastructure environments
  • Proficiency with Git version control and CI/CD configuration
  • Deep understanding of monitoring and alerting systems, preferably Prometheus and Grafana
  • Ability to debug issues in complex distributed systems, networks, and Linux operating systems
  • Containerization and orchestration experience (Docker, Nomad; Kubernetes a plus)
  • Strong scripting skills (Bash, Python, or Go)
  • Self-starter capable of thriving independently and remotely in fast-paced environments

Nice To Haves

  • Background working with distributed systems and technologies (Kafka, gRPC, Redis, etc.)
  • Experience operating services with external SLAs or in a B2B/enterprise context
  • Experience with benchmarking, performance tuning, and identifying system bottlenecks
  • Proficiency with databases (SQL and NoSQL) and production operations experience
  • Interest in lower-level programming languages such as Rust
  • Experience integrating with APIs (GitLab, Jira, Slack)

Responsibilities

  • Manage and support infrastructure for Payward Services, including Nomad, Kubernetes, databases, and third-party system integrations
  • Provide operational support across multiple teams, helping debug issues in staging and production environments
  • Participate in incident response and post-incident reviews to improve system resilience
  • Consult with teams on performance, monitoring, and alerting best practices, with awareness of partner-facing SLA commitments
  • Build tooling, automation, and dashboards to improve observability and empower development teams
  • Maintain and troubleshoot CI pipelines, ensuring reliable and fast build, test, and deployment cycles
  • Collaborate with developers, QA, and product managers to streamline development and release cycles
  • Support a fully distributed team operating across multiple time zones

© 2026 Teal Labs, Inc