Senior DevOps Engineer

Search Atlas•San Francisco, CA

$170,000 - $250,000•Hybrid

About The Position

Search Atlas Group is a bootstrapped, profitable SaaS company and the pioneer of agentic marketing: AI systems that don't just analyze data, they act autonomously on behalf of marketers at scale. From our headquarters in San Francisco, we serve Fortune 500 companies and growth-stage businesses alike, delivering enterprise-grade reliability across terabytes of search data. We've grown to $32M ARR without venture funding, and we're building toward $100M+ by doing more with less: smarter systems, leaner teams, and a relentless focus on outcomes over optics. If you want to work somewhere that moves fast, builds things that matter, and treats its people like adults, this is it.

The Opportunity: Own the Infrastructure That Powers Agentic Marketing at Scale

Search Atlas runs on GCP-native infrastructure that processes terabytes of search data daily, serving enterprise clients who expect zero downtime and real-time results. As we scale from $32M to $100M+ ARR, the reliability, security, and performance of our platform infrastructure become mission-critical. We're looking for a Senior DevOps Engineer who's ready to own it.

This is a hands-on, high-impact role at the center of our engineering organization. You'll manage and scale our Kubernetes clusters, evolve our GitOps and CI/CD workflows, and build the observability systems that keep our platform healthy and our team ahead of issues before clients ever notice them. You'll work shoulder-to-shoulder with backend, frontend, and QA teams; your work enables theirs.

The challenge ahead is real and rewarding: harden and scale an infrastructure stack that already handles significant load, while introducing the automation, traceability, and cost discipline that will carry us to the next order of magnitude. If that's the kind of problem that gets you out of bed, we want to talk.

You're walking into a cloud-native, fast-moving engineering team that cares deeply about system reliability and automation. Here's what your world looks like:

Day-to-Day Work

You'll spend your time administering GKE clusters, building and maintaining GitLab CI/CD pipelines, managing infrastructure via Terraform, and extending OpenTelemetry instrumentation across our services. Grafana and Sentry will be your observability tools of choice, and ArgoCD is central to how deployments flow. You'll also participate in on-call rotations and own incident response when production issues arise.

Team & Partners

You'll collaborate closely with backend, frontend, and QA engineers. DevOps here isn't a siloed function: you're a core partner to the entire engineering org, helping teams ship faster and more reliably.

Domain Context

You'll be operating infrastructure that directly powers SEO intelligence and autonomous marketing workflows for Fortune 500 clients. Performance, uptime, and data integrity aren't abstract concerns; they're tied directly to client outcomes and business growth.

Scale

GCP-native infrastructure supporting a wide range of microservices, multiple databases (PostgreSQL, ClickHouse, Elasticsearch), and high-throughput data pipelines, all serving enterprise and growth-stage clients globally.

Requirements

5+ years of experience in DevOps or SRE roles in production environments
Strong proficiency with Kubernetes (GKE preferred) and GitOps workflows using ArgoCD
Deep knowledge of GCP infrastructure and Terraform-based IaC practices
Hands-on experience with OpenTelemetry for distributed tracing and instrumentation
Expertise with Grafana and Sentry for observability, alerting, and monitoring
Operational knowledge of PostgreSQL, Elasticsearch, and ClickHouse
Proven track record of incident response and production troubleshooting under pressure
A clear, collaborative communicator who works well across engineering disciplines
You take ownership, automate everything you can, and treat reliability as a product feature

Nice To Haves

Experience with Helm, Kustomize, or OPA/Gatekeeper
Cloud cost optimization on GCP
Secret management with Vault or GCP Secret Manager
Blue-green or canary deployment experience

Responsibilities

Administer GKE clusters
Build and maintain GitLab CI/CD pipelines
Manage infrastructure via Terraform
Extend OpenTelemetry instrumentation across our services
Participate in on-call rotations
Own incident response when production issues arise

Benefits

Health Insurance: Fully covered medical (Aetna), 99% covered dental & vision — at no cost to you
PTO: Unlimited PTO (manager-approved; max 6 consecutive days)
Parental Leave: Paid leave for both birthing and non-birthing parents
Personal Development: $100/quarter development budget
Lasik Benefit: Company-paid Lasik eye surgery (eligible after 2 years)
Pet Insurance: Coverage for up to 2 pets via Lemonade (up to $100/month)
Retirement: 401(k) plan through Deel
