Sr. DevOps Engineer II

DoubleVerify•New York, NY

12h•$111,000 - $222,000•Hybrid

About The Position

DoubleVerify is the leading independent provider of marketing measurement software, data and analytics that authenticates the quality and effectiveness of digital media for the world's largest brands and media platforms. DV provides media transparency and accountability to deliver the highest level of impression quality for maximum advertising performance. Since 2008, DV has helped hundreds of Fortune 500 companies gain the most from their media spend by delivering best-in-class solutions across the digital ecosystem, helping to build a better industry. Learn more at www.doubleverify.com.

The Opportunity

The DevOps Platform team builds and operates the shared infrastructure foundation that powers all of DoubleVerify's engineering teams. We're the team behind the platforms processing billions of events daily: Kubernetes clusters spanning cloud and data centers, streaming and data systems, workflow orchestration, and the observability stack that keeps everything running.

As a Senior Platform Engineer, you'll own critical infrastructure that hundreds of developers depend on every day. This isn't about keeping the lights on; it's about building resilient, self-service platforms that make complex distributed systems simple to operate. You'll work with cutting-edge technologies (Kubernetes, Kafka, Aerospike, ArgoCD, Envoy) and have the autonomy to define standards and shape how DV's infrastructure evolves as the company scales. If you love building platforms that other engineers love using, this is your role.

Requirements

5+ years in DevOps, Platform Engineering, or SRE roles operating production infrastructure at scale
3+ years hands-on experience with Kubernetes in production environments (bonus if you've owned/managed a K8s platform)
Strong cloud platform expertise in GCP or AWS (multi-cloud experience valued)
Software engineering mindset - you write code (Python, Go, Bash) to solve infrastructure problems, not just configure tools
Observability-driven troubleshooting - you're comfortable diving into metrics, logs, and traces to diagnose distributed system issues
Platform thinking - you design for reliability, scalability, and developer experience, not just "getting it working"

Nice To Haves

Hands-on experience with Kafka, Aerospike, ArgoCD, or Airflow in production
Background with service mesh technologies (Envoy Gateway, Istio)
Experience with GitOps workflows and infrastructure-as-code (Terraform, Crossplane)
Contributions to open-source platform tooling or CNCF projects
Familiarity with Site Reliability Engineering (SRE) practices and culture

Responsibilities

Own Company-Wide Infrastructure Platforms
Design, deploy, and operate Kubernetes platforms across GCP, AWS, and data center environments that serve as the foundation for 100+ engineering teams
Build and maintain critical shared services: streaming (Kafka), data storage (Aerospike), workflow orchestration (Airflow), and observability (Prometheus, Grafana) that process billions of events with 99.9%+ reliability

Elevate Developer Experience
Create tooling and automation that transforms complex platform operations into simple self-service workflows, empowering developers while maintaining security and stability
Drive CI/CD evolution by building operators, controllers, and management tools that reduce toil and accelerate deployment velocity

Shape Infrastructure Standards
Partner with product teams from day one to ensure new features integrate cleanly and reliably into DV's infrastructure, preventing technical debt before it happens
Define and promote best practices for automation, observability, security, and maintainability that scale across the organization

Lead Technical Projects
Plan and deliver high-impact infrastructure initiatives in collaboration with multidisciplinary DevOps, SRE, and engineering teams across the US, Israel, and Europe
Use metrics, logs, and traces to proactively identify and resolve performance bottlenecks, turning insights into lasting improvements