About The Position

As a Staff Site Reliability Engineer, you’ll serve as a technical leader and architect within the Platform Engineering team - shaping the design, reliability, and scalability of Paxos’ next-generation infrastructure platform. You’ll lead initiatives that define how Paxos builds, operates, and scales its cloud systems - from Infrastructure as Code and Kubernetes automation to observability and multi-region resilience. Your work will directly influence how we meet our commitments to customers, partners, and regulators by ensuring our platform is secure, compliant, and resilient by design.

Requirements

  • Bachelor’s degree in Computer Science, Information Technology, or a related field — or equivalent practical experience.
  • 8+ years of experience in Site Reliability Engineering, DevOps, or related infrastructure roles.
  • Deep expertise in public cloud platforms, especially AWS, with hands-on experience in services like EC2, S3, Lambda, CloudWatch, and IAM.
  • Strong proficiency with Kubernetes and container orchestration — you’ve run production workloads and understand cluster management, scaling, and troubleshooting.
  • Extensive experience with Infrastructure as Code (IaC) using tools such as Terraform, Pulumi, or Crossplane.
  • Solid scripting or programming skills in languages like Python, Bash, or Go, with a strong focus on automation.
  • Excellent problem-solving and debugging skills, with a systems-thinking mindset.
  • Strong communicator who thrives in collaborative, remote-first teams.

Nice To Haves

  • Working knowledge of managed database services such as Amazon RDS, Aurora, or PostgreSQL, though infrastructure remains the primary focus.

Responsibilities

  • Architect, build, and operate resilient, scalable, and self-healing cloud infrastructure on AWS.
  • Lead the evolution of Kubernetes and platform services to enable secure, automated, and multi-region operations.
  • Define and enforce Infrastructure as Code (IaC) standards using Terraform, AWS CDK, and Crossplane to ensure consistency, security, and auditability.
  • Drive automation across provisioning, configuration, and monitoring pipelines to reduce manual effort and operational risk.
  • Establish and champion reliability, observability, and performance standards across Tier-1 services, ensuring alignment with regulatory and partner requirements.
  • Partner with product engineering to enhance CI/CD velocity, service resilience, and visibility through shared tooling, SLOs, and platform patterns.
  • Lead incident reviews, root-cause analyses, and systemic reliability improvements, embedding learnings into runbooks and design practices.
  • Optimize cloud infrastructure for cost, performance, and fault tolerance, using operational data to guide decisions.
  • Mentor and upskill engineers, shaping architectural direction and influencing design decisions across multiple teams.
  • Contribute to the technical strategy and roadmap for Paxos’ infrastructure platform, aligning platform scalability with business growth and compliance objectives.

Benefits

  • Paxos offers a competitive total compensation and benefits package, including equity and bonuses based on both your individual performance and company performance.
  • Eligibility for bonuses is dependent on job level, and actual salary within the range depends on your skills, experience, and qualifications.