Senior Site Reliability Engineer

Cognitiv • Bellevue, WA

16h • $160,000 - $210,000 • Hybrid

About The Position

Are you ready to revolutionize the advertising industry? At Cognitiv, we are not just another AdTech company; we are industry trailblazers redefining media buying with our Deep Learning Advertising Platform. Since 2015, we have harnessed the power of cutting-edge deep learning technology and data science to transform how brands connect with their customers. Our mission? To bring intelligence to advertising and deliver unparalleled precision, relevance, and impact at scale. With our innovative platform, advertisers enjoy unprecedented flexibility, whether it is activating Dynamic Deals through their preferred DSP, leveraging our managed service DSP, or utilizing our industry-first ContextGPT product. As a part of Cognitiv, you will be at the forefront of AI-driven advertising solutions, driving change and achieving remarkable growth in a rapidly evolving industry. Now, we’re growing!

The Role

We are looking for a Senior Site Reliability Engineer (SRE) to help expand Cognitiv’s global datacenter footprint and elevate our service management practices. This role is highly hands-on and central to our strategy of building a robust, scalable, and cost-efficient hybrid cloud infrastructure. You’ll play a critical part in ensuring our systems meet the needs of a rapidly growing company while driving us toward industry best practices.

Requirements

Experienced Operator – 7+ years in operations, engineering, or SRE with expertise in multi-datacenter deployments.
AWS & Networking Expert – Deep knowledge of AWS infrastructure, networking, and service management practices.
Technical Builder – Skilled in infrastructure as code and automation, with proficiency in Python and Bash.
Independent Driver – Self-starter who thrives on ownership, problem solving, and big-picture thinking.
Collaborative Partner – Strong communicator and supportive teammate who works well across functions.

Nice To Haves

Hybrid Cloud Experience – Exposure to hybrid cloud/on-prem solutions.
Datacenter Buildouts – Hands-on experience setting up and managing physical datacenters.
Flexibility to Travel – Willingness to travel 1–2 times per quarter for deployments.

Responsibilities

Build & Scale Infrastructure – Design, implement, and maintain infrastructure across a growing footprint of datacenters and hybrid cloud deployments.
Evaluate & Optimize – Assess physical and network architectures, ensuring scalability, reliability, and cost efficiency.
Elevate Reliability – Improve monitoring, incident management, and disaster recovery, including our migration to Datadog.
Own IaC & Automation – Implement infrastructure as code with Terraform and Ansible, using Python and Bash for automation.
Operate at Scale – Monitor and maintain shared infrastructure in our AWS environment, ensuring availability and stability.
Collaborate & Align – Work closely with engineering and product teams to scope projects tightly to core business requirements.
Drive Change – Lead major service management initiatives that shape Cognitiv’s engineering practices.