DevOps Engineer

Sage Care
Palo Alto, CA

About The Position

We’re hiring a hands-on DevOps Engineer to help architect, scale, and harden the infrastructure powering Sage Care’s AI-driven healthcare platform. This role sits at the center of our engineering delivery engine, owning cloud infrastructure, Kubernetes platform reliability, CI/CD pipelines, and Bazel-based build systems. You’ll work closely with backend and AI engineers to ensure our systems are secure, observable, and built to support real-time, production-grade healthcare workloads.

As a Series A company experiencing rapid growth, we need someone who can move fluidly between infrastructure architecture, release engineering, and production reliability, and who takes pride in building clean, automated systems that reduce risk and increase developer velocity.

This is a high-impact role with meaningful ownership. You won’t just maintain infrastructure; you’ll shape how it evolves as we scale.

Requirements

  • 4–7+ years of DevOps, SRE, or Infrastructure Engineering experience.
  • Strong hands-on experience with GCP and Kubernetes (GKE preferred).
  • Deep experience with Terraform and infrastructure-as-code.
  • Experience building and maintaining CI/CD pipelines.
  • Experience working with Bazel or similar build systems in production environments.
  • Strong understanding of networking, IAM, and cloud security fundamentals.
  • Experience supporting production systems with uptime requirements.

Nice To Haves

  • Experience supporting HIPAA or SOC 2 environments.
  • Experience in early-stage or high-growth startups.
  • Familiarity with AI/ML infrastructure or real-time systems.
  • Experience with release engineering best practices in monorepos.

Responsibilities

Infrastructure Architecture & Automation

  • Design, build, and maintain cloud infrastructure in GCP using Terraform.
  • Architect and manage Kubernetes (GKE) clusters across dev, staging, and production.
  • Improve networking, IAM, ingress architecture, and environment isolation.
  • Build reusable infrastructure modules and eliminate configuration drift.
  • Ensure infrastructure is scalable, cost-efficient, and production-grade.

CI/CD & Deployment Reliability

  • Design and maintain CI/CD pipelines that enable safe, rapid deployments.
  • Own and optimize Bazel build configuration, caching, and reproducibility.
  • Improve build performance and developer velocity within a monorepo environment.
  • Implement safe release strategies (canary, blue/green, rollbacks).
  • Ensure environment parity and reduce deployment-related incidents.

Observability & Reliability

  • Implement and improve logging, monitoring, and alerting across services and Kubernetes workloads.
  • Establish SLIs/SLOs and drive reliability improvements.
  • Reduce MTTR through improved visibility and incident response processes.
  • Improve production readiness standards and postmortem practices.
  • Ensure infrastructure supports increasing AI workloads and real-time traffic.

Security & Compliance

  • Strengthen IAM, secrets management, and least-privilege access controls.
  • Harden Kubernetes clusters and cloud infrastructure against misconfiguration.
  • Partner with security leadership to support HIPAA and SOC 2 compliance requirements.
  • Improve auditability, change tracking, and infrastructure governance.
  • Embed security best practices directly into CI/CD and build workflows.