Senior DevOps Engineer

Search Atlas | San Francisco, CA
14h | $170,000 - $250,000 | Hybrid

About The Position

Search Atlas Group is a bootstrapped, profitable SaaS company and the pioneer of agentic marketing: AI systems that don't just analyze data, they act autonomously on behalf of marketers at scale. From our headquarters in San Francisco, we serve Fortune 500 companies and growth-stage businesses alike, delivering enterprise-grade reliability across terabytes of search data. We've grown to $32M ARR without venture funding, and we're building toward $100M+ by doing more with less: smarter systems, leaner teams, and a relentless focus on outcomes over optics. If you want to work somewhere that moves fast, builds things that matter, and treats its people like adults, this is it.

The Opportunity: Own the Infrastructure That Powers Agentic Marketing at Scale

Search Atlas runs on a GCP-native infrastructure that processes terabytes of search data daily, serving enterprise clients who expect zero downtime and real-time results. As we scale from $32M to $100M+ ARR, the reliability, security, and performance of our platform infrastructure become mission-critical. We're looking for a Senior DevOps Engineer who's ready to own it.

This is a hands-on, high-impact role at the center of our engineering organization. You'll manage and scale our Kubernetes clusters, evolve our GitOps and CI/CD workflows, and build the observability systems that keep our platform healthy and our team ahead of issues before clients ever notice them. You'll work shoulder-to-shoulder with backend, frontend, and QA teams; your work enables theirs.

The challenge ahead is real and rewarding: harden and scale an infrastructure stack that already handles significant load, while introducing the automation, traceability, and cost discipline that will carry us to the next order of magnitude. If that's the kind of problem that gets you out of bed, we want to talk.

You're walking into a cloud-native, fast-moving engineering team that cares deeply about system reliability and automation. Here's what your world looks like:

Day-to-Day Work

You'll spend your time administering GKE clusters, building and maintaining GitLab CI/CD pipelines, managing infrastructure via Terraform, and extending OpenTelemetry instrumentation across our services. Grafana and Sentry will be your observability tools of choice, and ArgoCD is central to how deployments flow. You'll also participate in on-call rotations and own incident response when production issues arise.

Team & Partners

You'll collaborate closely with backend, frontend, and QA engineers. DevOps here isn't a siloed function; you're a core partner to the entire engineering org, helping teams ship faster and more reliably.

Domain Context

You'll be operating infrastructure that directly powers SEO intelligence and autonomous marketing workflows for Fortune 500 clients. Performance, uptime, and data integrity aren't abstract concerns; they're tied directly to client outcomes and business growth.

Scale

GCP-native infrastructure supporting a wide range of microservices, multiple databases (PostgreSQL, ClickHouse, Elasticsearch), and high-throughput data pipelines, all serving enterprise and growth-stage clients globally.

Requirements

  • 5+ years of experience in DevOps or SRE roles in production environments
  • Strong proficiency with Kubernetes (GKE preferred) and GitOps workflows using ArgoCD
  • Deep knowledge of GCP infrastructure and Terraform-based IaC practices
  • Hands-on experience with OpenTelemetry for distributed tracing and instrumentation
  • Expertise with Grafana and Sentry for observability, alerting, and monitoring
  • Operational knowledge of PostgreSQL, Elasticsearch, and ClickHouse
  • Proven track record of incident response and production troubleshooting under pressure
  • A clear, collaborative communicator who works well across engineering disciplines
  • You take ownership, automate everything you can, and treat reliability as a product feature

Nice To Haves

  • Experience with Helm, Kustomize, or OPA/Gatekeeper
  • Cloud cost optimization on GCP
  • Secret management with Vault or GCP Secret Manager
  • Blue-green or canary deployment experience

Responsibilities

  • Administer GKE clusters
  • Build and maintain GitLab CI/CD pipelines
  • Manage infrastructure via Terraform
  • Extend OpenTelemetry instrumentation across our services
  • Participate in on-call rotations
  • Own incident response when production issues arise

Benefits

  • Health Insurance: Fully covered medical (Aetna), 99% covered dental & vision — at no cost to you
  • PTO: Unlimited PTO (manager-approved; max 6 consecutive days)
  • Parental Leave: Paid leave for both birthing and non-birthing parents
  • Personal Development: $100/quarter development budget
  • Lasik Benefit: Company-paid Lasik eye surgery (eligible after 2 years)
  • Pet Insurance: Coverage for up to 2 pets via Lemonade (up to $100/month)
  • Retirement: 401(k) plan through Deel