Senior Site Reliability Engineer (Web Apps)

Criteo Corp.•Paris, TX

49d•Hybrid

About The Position

What's the Platform PRE group?

The concept of Product Reliability Engineering (PRE) draws inspiration from the principles of SRE. At Criteo, PRE acts as the bridge between Product, Platform Engineering and Infrastructure. The PRE group comprises nine global engineering teams helping R&D design, build, and operate large-scale distributed systems reliably and efficiently. The common objective of the PRE teams is to build the most reliable platform in AdTech.

How You'll Make An Impact

As a Site Reliability Engineer within the PRE WebApps team, you'll work closely with product engineering to improve the reliability of our apps, systems and pipelines and assess where optimization is needed most. You'll tell stories with meaningful monitoring and hopefully never be paged on your on-call rotation because we've worked hard with dev teams to make our platform the most reliable in AdTech. Speaking of on-call: with a group of 7, you're looking at only around 8 weeks a year, and your time is compensated! You'll learn skills from other team members along the way and have opportunities to teach us! It's perfect for an engineer who likes shipping code and wants to be involved in all aspects of reliability, efficiency & maintainability.

Requirements

You hold a master's or PhD degree in computer science or a related field, or have equivalent practical experience.
You have at least 5 years of experience as an SRE or Software/DevOps Engineer.
You have significant experience in software development in one or more programming languages, as well as with data structures or algorithms.
You're at ease with designing, analyzing, and troubleshooting large-scale distributed systems.
You have experience working in computing, distributed systems, storage, or networking.
You are used to debugging and optimizing code and to automating routine tasks.
You show a systematic problem-solving approach, coupled with effective verbal and written communication skills.

Responsibilities

Engage in and improve the whole lifecycle of services - from inception and design, through to deployment, operation, and refinement.
Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.
Scale, automate, and evolve systems by pushing for changes that improve reliability, efficiency and performance.
Optimize and maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
Practice incident response and blameless postmortems.
Work with a stack that includes Linux, Kubernetes, .NET Core, C#, Python, Java/Scala/JVM, Prometheus, Grafana, Kibana and more.

Benefits

Ways of working - Our hybrid model blends home with in-office experiences, making space for both.
Grow with us - Learning, mentorship & career development programs.
Your wellbeing matters - Health benefits, wellness perks & mental health support.
A team that cares - Diverse, inclusive, and globally connected.
Fair pay & perks - Attractive salary, with performance-based rewards and family-friendly policies, plus the potential for equity depending on role and level.
