Senior Site Reliability Engineer

Paxos•Canada, KY

63d

About The Position

As a Senior Site Reliability Engineer, you’ll play a critical role in shaping and scaling the infrastructure that powers our platform. You’ll work closely with engineering teams to ensure our systems are reliable, secure, and performant, with a strong focus on Kubernetes, AWS services, and infrastructure as code. Your expertise will help drive automation, improve developer velocity, and support the continued growth and maintenance of our cloud-native environment.

Requirements

Bachelor’s degree in Computer Science, Information Technology, or a related field — or equivalent practical experience.
5+ years of experience in Site Reliability Engineering, DevOps, or related infrastructure roles.
Deep expertise in public cloud platforms, especially AWS, with hands-on experience in services like EC2, S3, Lambda, CloudWatch, and IAM.
Strong proficiency with Kubernetes and container orchestration — you’ve run production workloads and understand cluster management, scaling, and troubleshooting.
Extensive experience with Infrastructure as Code (IaC) using tools such as Terraform, Pulumi, or Crossplane.
Solid scripting or programming skills in languages like Python, Bash, or Go, with a strong focus on automation.
Excellent problem-solving and debugging skills, with a systems-thinking mindset.
Strong communicator who thrives in collaborative, remote-first teams.

Nice To Haves

Working knowledge of managed database services such as Amazon RDS or Aurora, or of PostgreSQL, is a plus, but infrastructure is your main focus.

Responsibilities

Design, build, and operate scalable, highly available cloud infrastructure primarily on AWS.
Manage and evolve our Kubernetes environments to support the deployment and operation of modern, containerized applications.
Define and implement Infrastructure as Code (IaC) using tools like Terraform, CDK, or Crossplane.
Automate infrastructure provisioning, configuration, maintenance, and monitoring to reduce manual effort and improve reliability.
Apply best practices around security, observability, and cost optimization across infrastructure and services.
Manage and optimize database technologies, with a focus on Amazon RDS and Aurora.
Partner with development teams to
Investigate and resolve incidents, perform root cause analysis, and implement long-term fixes.
Participate in on-call rotations and provide support for critical production systems.
Contribute to SRE best practices, internal tooling, and team knowledge sharing.

Benefits

Paxos offers a competitive total compensation and benefits package, including equity and bonuses based on both your individual performance and company performance.
Eligibility for bonuses is dependent on job level, and actual salary within the range depends on your skills, experience, and qualifications.
