Staff Site Reliability Engineer, Platform Engineering

Paxos • Canada, KY

About The Position

As a Staff Site Reliability Engineer, you’ll serve as a technical leader and architect within the Platform Engineering team, shaping the design, reliability, and scalability of Paxos’ next-generation infrastructure platform. You’ll lead initiatives that define how Paxos builds, operates, and scales its cloud systems, from Infrastructure as Code and Kubernetes automation to observability and multi-region resilience. Your work will directly influence how we meet our commitments to customers, partners, and regulators by ensuring our platform is secure, compliant, and resilient by design.

Requirements

Bachelor’s degree in Computer Science, Information Technology, or a related field — or equivalent practical experience.
8+ years of experience in Site Reliability Engineering, DevOps, or related infrastructure roles.
Deep expertise in public cloud platforms, especially AWS, with hands-on experience in services like EC2, S3, Lambda, CloudWatch, and IAM.
Strong proficiency with Kubernetes and container orchestration — you’ve run production workloads and understand cluster management, scaling, and troubleshooting.
Extensive experience with Infrastructure as Code (IaC) using tools such as Terraform, Pulumi, or Crossplane.
Solid scripting or programming skills in languages like Python, Bash, or Go, with a strong focus on automation.
Excellent problem-solving and debugging skills, with a systems-thinking mindset.
Strong communicator who thrives in collaborative, remote-first teams.

Nice To Haves

Working knowledge of managed database services such as Amazon RDS or Aurora, or of PostgreSQL, is a plus, but infrastructure is your main focus.

Responsibilities

Architect, build, and operate resilient, scalable, and self-healing cloud infrastructure on AWS.
Lead the evolution of Kubernetes and platform services to enable secure, automated, and multi-region operations.
Define and enforce Infrastructure as Code (IaC) standards using Terraform, AWS CDK, and Crossplane to ensure consistency, security, and auditability.
Drive automation across provisioning, configuration, and monitoring pipelines to reduce manual effort and operational risk.
Establish and champion reliability, observability, and performance standards across Tier-1 services, ensuring alignment with regulatory and partner requirements.
Partner with product engineering to enhance CI/CD velocity, service resilience, and visibility through shared tooling, SLOs, and platform patterns.
Lead incident reviews, root-cause analyses, and systemic reliability improvements, embedding learnings into runbooks and design practices.
Optimize cloud infrastructure for cost, performance, and fault tolerance, using data to guide operational decisions.
Mentor and upskill engineers, shaping architectural direction and influencing design decisions across multiple teams.
Contribute to the technical strategy and roadmap for Paxos’ infrastructure platform, aligning platform scalability with business growth and compliance objectives.

Benefits

Paxos offers a competitive total compensation and benefits package, including equity and bonuses based on both your individual performance and company performance.
Eligibility for bonuses is dependent on job level, and actual salary within the range depends on your skills, experience, and qualifications.
