Staff DevOps Engineer

Artisan•San Francisco, CA

56d•Remote

About The Position

We're building AI employees. Not chatbots. Not copilots. Autonomous digital workers that do real jobs. Our first, Ava, is an AI BDR used by hundreds of companies. She researches leads, writes and sends emails in a customer's voice, runs multi-step outbound sequences, manages her own deliverability infrastructure, self-optimizes over time, and handles objections and meeting booking. She's not a tool someone uses. She's a teammate.

We're a YC W24 company, have raised $35M+ from investors including Y Combinator, and are at $8M+ ARR. Right now we're building Ava 2.0, a step change in what an AI employee can do. The engineering problems are hard and the surface area is enormous.

Artisan runs a platform that manages hundreds of millions of leads, orchestrates autonomous AI agents in real time, sends email at massive scale across thousands of customer mailboxes, and serves a product suite that replaces an entire sales stack (CRM, inbox, dialer, lead database, campaign engine). The infrastructure that makes all of this work reliably is the job.

You'll own the full DevOps and infrastructure layer at Artisan. This is a staff-level role where you'll set the foundation, define the practices, and build the systems that everything else depends on. What that looks like concretely:

Kubernetes and container orchestration. We run on AWS with Kubernetes at the core. You'll own cluster management, scaling policies, resource optimization, and the overall compute layer that serves AI workloads, data pipelines, and customer-facing product infrastructure.
CI/CD and deployment pipelines. We ship constantly. You'll build and maintain the deployment infrastructure that lets engineers release with confidence, multiple times a day. Automated testing, staging environments, rollback strategies, feature flags.
Observability and reliability. Monitoring, alerting, logging, tracing. When an AI agent fails to send an email or a pipeline stalls on lead enrichment, the team needs to know immediately and understand why. You'll build the observability layer from the ground up.
Email deliverability infrastructure. Ava sends email at scale across thousands of domains and mailboxes. Sender reputation, domain warming, DNS configuration, IP management. The infrastructure layer here is non-trivial and directly impacts customer outcomes.
Security and compliance. We handle customer data, email credentials, and business-critical workflows. You'll own security posture across the stack: secrets management, access controls, network policies, vulnerability scanning, and SOC 2 readiness.
Cost optimization. AI workloads and data-intensive pipelines at our scale generate real cloud bills. You'll architect for efficiency, right-size infrastructure, and build visibility into where spend goes and why.
Developer experience. The best infrastructure is the kind engineers don't have to think about. Local development environments, fast builds, reliable deploys, clear documentation. You'll make the whole team faster.

Requirements

6+ years of experience in DevOps, SRE, or infrastructure engineering, with a track record of owning production systems end to end
Deep expertise with Kubernetes, container orchestration, and cloud-native architecture on AWS
Strong CI/CD experience: you've built pipelines that teams actually trust and use daily
Fluent in infrastructure-as-code (Terraform, Pulumi, or equivalent) and GitOps workflows
Solid understanding of networking, DNS, load balancing, and security fundamentals
Experience with monitoring and observability tooling (Datadog, Grafana, Prometheus, or similar)
Comfortable scripting in Python, Bash, or Go for automation and tooling
Experience operating databases in production (PostgreSQL, Redis, or similar)
Strong opinions on reliability, incident response, and on-call practices, but pragmatic about when to move fast

Nice To Haves

Experience with email infrastructure at scale (deliverability, domain/IP management, SMTP)
Experience supporting ML/AI workloads in production
Startup experience where you built the infrastructure function from scratch

Responsibilities

Kubernetes and container orchestration. You'll own cluster management, scaling policies, resource optimization, and the overall compute layer that serves AI workloads, data pipelines, and customer-facing product infrastructure.
CI/CD and deployment pipelines. You'll build and maintain the deployment infrastructure that lets engineers release with confidence, multiple times a day. Automated testing, staging environments, rollback strategies, feature flags.
Observability and reliability. You'll build the observability layer from the ground up.
Email deliverability infrastructure. You'll handle sender reputation, domain warming, DNS configuration, and IP management.
Security and compliance. You'll own security posture across the stack: secrets management, access controls, network policies, vulnerability scanning, and SOC 2 readiness.
Cost optimization. You'll architect for efficiency, right-size infrastructure, and build visibility into where spend goes and why.
Developer experience. You'll make the whole team faster.