Senior Site Reliability Engineer

Adyen • Chicago, IL

About The Position

Adyen provides payments, data, and financial products in a single solution for customers like Facebook, Uber, H&M, and Microsoft - making us the financial technology platform of choice. At Adyen, everything we do is engineered for ambition. For our teams, we create an environment with opportunities for our people to succeed, backed by the culture and support to ensure they are enabled to truly own their careers. The people of Adyen are motivated individuals who tackle unique technical challenges.

Site Reliability Engineer

We are looking for a Senior Site Reliability Engineer to design, scale, and secure our internal infrastructure. You will bridge the gap between high-level system architecture and deep-dive technical troubleshooting, with a specific focus on observability and high availability.

Requirements

10+ years of experience in high-traffic environments where downtime has a direct financial or operational impact.
Advanced experience managing production Kubernetes clusters and apps using Helm and ArgoCD.
Proficiency with Infrastructure as Code (IaC) for provisioning both cloud and on-premises resources, ideally with Terraform.
Hands-on experience with Consul, Vault, and HAProxy.
Experience managing and troubleshooting large-scale Mail Transfer Agents (MTAs), particularly Postfix.
Proficiency in one of the following programming languages: Go or Python.

Nice To Haves

Experience managing Next Generation Firewalls (NGFW), ideally Palo Alto GlobalProtect.
Experience managing and maintaining LDAP infrastructure.

Responsibilities

Architect and manage highly available, distributed systems across multiple global data centers with a focus on optimised performance and disaster recovery.
Define and enforce SLOs/SLIs, manage error budgets, and lead post-mortems.
Participate in an on-call rotation, acting as a point of escalation for complex infrastructure outages.
Identify and automate manual operations to effectively reduce toil.
Design and implement multi-layered monitoring strategies (synthetic, blackbox, and whitebox) for both on-premises and SaaS systems, using tools like Prometheus, Grafana, and ELK.
Act as a technical mentor within the team, facilitating the upskilling of team members across different global regions.
