Site Reliability Engineer (GKE) - Draper, UT

ProdataKey
$125,000 · Onsite

About The Position

We are seeking a hands-on Site Reliability Engineer to join our team. You will bridge the gap between software development and infrastructure operations, treating operational challenges as engineering problems. By leveraging automation, designing resilient distributed systems, and championing observability, you will ensure that our customers can secure and manage their access control systems without friction or failure.

Requirements

  • Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field, or equivalent experience
  • 3+ years of experience in an SRE, DevOps, or Infrastructure Engineering role supporting production cloud environments
  • Deep hands-on experience with major cloud providers (AWS or GCP) and a strong command of containerization and orchestration technologies (Docker and Kubernetes).
  • Strong programming proficiency in languages such as Python, Go, TypeScript, or Bash for automation, internal tooling, and system integrations.
  • Solid fundamentals in Linux/Unix administration, networking protocols (TCP/IP, DNS, HTTP/S, load balancing), and cloud security best practices.
  • A passionate problem-solver who prioritizes automation over manual operations and thrives in high-ownership environments.
  • Must pass a drug screening and a criminal background check
  • Ability to work well in an onsite team environment

Nice To Haves

  • Experience with multi-region deployments, failover strategies, and data consistency
  • Experience managing high-throughput message queues or data streaming platforms, specifically RabbitMQ or Apache Kafka.
  • Experience operating production systems at scale
  • Familiarity with modern data stack environments, such as Snowflake or relational databases in a self-hosted environment.
  • Relevant industry certifications such as Certified Kubernetes Administrator (CKA) or AWS Certified DevOps Engineer Professional.
  • Familiarity with regulatory requirements (SOC 2, GDPR, etc.)
  • Experience in physical security or access control systems
  • Familiarity with the GCP ecosystem and tooling
  • Experience working in a scaling startup environment

Responsibilities

  • Design, build, and maintain scalable, secure multi-tenant cloud infrastructure using Infrastructure as Code (IaC) principles.
  • Own the availability, latency, performance, and capacity planning of the pdk.io platform and its supporting backend microservices.
  • Develop and manage robust monitoring, logging, and alerting systems to gain deep visibility into cloud infrastructure, API health, and IoT endpoint performance.
  • Participate in a collaborative on-call rotation. Lead rapid incident response and mitigation, and drive rigorous, blameless post-mortems to ensure long-term system resilience.
  • Optimize and secure automated deployment pipelines to enable developers to ship code to production safely and efficiently.
  • Partner closely with backend developers and hardware engineering teams to define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) and to manage error budgets.
  • Contribute to technical documentation and knowledge sharing.

Benefits

  • Competitive salary starting at $125,000 depending on experience
  • Comprehensive medical, dental, and vision coverage
  • 401(k) with company match
  • 3-5 weeks PTO annually based on tenure
  • Paid company holidays